How to Use Category Folders to Improve Document Categorization
Introduction
This guide provides practical, actionable steps for using category folders to improve how Sky AI categorizes your documents. While folders are optional, they can significantly enhance categorization accuracy when used strategically.
Who This Guide Is For: Organization administrators, case managers, and anyone responsible for setting up or maintaining document categories in Sky AI.
What You'll Learn:
- How folder organization directly impacts AI categorization accuracy
- Step-by-step strategies for structuring folders effectively
- Real-world examples and before/after scenarios
- How to test and validate your improvements
- Common mistakes to avoid
Understanding How Folders Improve Categorization
The AI Categorization Process
When Sky AI processes a document, it goes through these steps:
- Content Extraction: The AI reads the document text and identifies key information
- Category Comparison: The AI compares the document against all available category definitions
- Best Match Selection: The AI assigns the document to the category with the strongest match
- Section Assignment: Individual sections within the document are assigned to categories
The AI uses several pieces of information from each category to make its decision:
- Definition: What the category represents
- Expected Content: What information should be in documents of this type
- Recognition Tips: Specific phrases, formatting, or visual cues
- Key Topics: Main subjects covered in this category
- Naming Convention: How documents are typically named
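One way to picture the five fields above is as a simple record per category. The sketch below is illustrative only: the `CategoryDefinition` class, its field names, and the `missing_fields` helper are hypothetical and do not represent Sky AI's actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CategoryDefinition:
    """Hypothetical record mirroring the five fields listed above."""

    name: str
    definition: str = ""
    expected_content: str = ""
    recognition_tips: str = ""
    key_topics: List[str] = field(default_factory=list)
    naming_convention: str = ""

    def missing_fields(self) -> List[str]:
        """Return the names of the fields that are still empty."""
        values = {
            "definition": self.definition,
            "expected_content": self.expected_content,
            "recognition_tips": self.recognition_tips,
            "key_topics": self.key_topics,
            "naming_convention": self.naming_convention,
        }
        return [label for label, value in values.items() if not value]


witness = CategoryDefinition(
    name="Witness Statement",
    definition="Statements from witnesses",
    recognition_tips="Witness accounts",
)
print(witness.missing_fields())
# prints ['expected_content', 'key_topics', 'naming_convention']
```

Listing the empty fields per category gives a quick view of which definitions give the AI the least information to work with.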
How Folders Help
Category folders improve categorization in three key ways:
1. Cognitive Organization for the AI
When categories are grouped into folders, the AI can use the folder context as an additional signal. For example, if a document contains medical terminology, the AI can prioritize categories within the "Medical Documents" folder over categories in the "Legal Documents" folder.
2. Reduced Ambiguity
Without folders, you might have categories like "Reports" in both medical and legal contexts. With folders, you can have:
- Medical Documents → Medical Reports
- Legal Documents → Legal Reports
This clear separation helps the AI distinguish between similar document types in different contexts.
3. Easier Human Review and Refinement
When categories are well-organized in folders, it's easier for you to:
- Review categorization accuracy by domain
- Identify missing categories within a specific area
- Spot overlapping or redundant categories
- Train new team members on the categorization system
Step-by-Step Strategy: Setting Up Folders for Better Categorization
Step 1: Audit Your Current Categories
Before creating folders, understand what you're working with.

Action Items:
- Export or list all your current organization-level categories
- Group them by theme on paper or in a spreadsheet
- Identify categories that are:
  - Similar in name but different in purpose
  - Overlapping in scope
  - Rarely used or never assigned
  - Missing clear definitions
Example Audit:
Current Categories (Unorganized):
- Medical Records
- Police Report
- Treatment Plan
- Legal Letter
- Invoice
- IME Report
- Correspondence
- Clinical Notes
- Witness Statement
- Payment Record
- Diagnostic Test
- Court Document
After Grouping:
Medical-Related:
- Medical Records
- Treatment Plan
- IME Report
- Clinical Notes
- Diagnostic Test
Legal-Related:
- Police Report
- Legal Letter
- Witness Statement
- Court Document
- Correspondence (ambiguous - could be medical or legal)
Financial-Related:
- Invoice
- Payment Record
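Parts of this audit can be automated if you can export your categories. The following is a minimal sketch, assuming a hypothetical export that maps each category name to its definition and the number of documents assigned to it. The sample data, the `audit` function, and the thresholds are all illustrative, and name similarity is only a rough proxy for "similar in name but different in purpose".

```python
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical audit export: category name -> (definition, documents assigned).
categories = {
    "Medical Records": ("Patient history and charts", 120),
    "Medical Reports": ("", 4),
    "Police Report": ("Accident reports filed by police", 35),
    "Police Records": ("", 0),
    "Invoice": ("Bills from providers", 60),
}


def audit(categories, similarity=0.75, min_usage=5):
    """Flag categories that match the audit criteria in Step 1."""
    missing_definition = [
        name for name, (definition, _) in categories.items() if not definition.strip()
    ]
    rarely_used = [
        name for name, (_, assigned) in categories.items() if assigned < min_usage
    ]
    # Character-level similarity between every pair of category names.
    similar_names = [
        (a, b)
        for a, b in combinations(categories, 2)
        if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= similarity
    ]
    return {
        "missing_definition": missing_definition,
        "rarely_used": rarely_used,
        "similar_names": similar_names,
    }


report = audit(categories)
for issue, items in report.items():
    print(issue, items)
```

Treat the output as a list of candidates for human review. Deciding whether two similarly named categories really overlap in scope still requires reading their definitions.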
Step 2: Design Your Folder Structure
Create 3-7 main folders that represent the major domains in your work.

Guidelines:
- Keep it simple: 3-7 folders is ideal. More than 10 becomes unwieldy
- Use clear, distinct names: Avoid overlapping folder purposes
- Think about your workflow: Organize by how your team thinks about documents
- Consider your industry: Use terminology familiar to your team
Example Folder Structures by Industry:
Insurance Claims (Personal Injury):
📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence
📁 Administrative
Legal Practice (General):
📁 Pleadings & Motions
📁 Discovery Materials
📁 Evidence & Exhibits
📁 Correspondence
📁 Client Documents
Healthcare (Patient Records):
📁 Clinical Records
📁 Diagnostic & Lab Results
📁 Treatment & Care Plans
📁 Administrative & Billing
📁 External Reports
Step 3: Assign Categories to Folders
Move your categories into the appropriate folders.

Action Items:
- Start with your most-used categories first
- Assign each category to exactly one folder
- Rename categories if needed to avoid confusion within folders
- Leave the fallback/miscellaneous category in a general folder or unfiled
Before and After Example:
Before (No Folders):
Categories:
- Medical Records
- Medical Reports
- Doctor's Notes
- Clinical Notes
- Treatment Records
- Treatment Plans
- Police Report
- Police Records
- Legal Correspondence
- Legal Letters
After (With Folders):
📁 Medical Documentation
- Medical Records (patient history, charts)
- Clinical Notes (doctor's observations)
- Treatment Plans (proposed care)
📁 Medical Assessments
- IME Reports (independent medical exams)
- Diagnostic Results (lab tests, imaging)
📁 Legal & Investigation
- Police Reports (accident reports, investigations)
- Legal Correspondence (attorney letters)
- Witness Statements
Notice: We consolidated similar categories (Medical Reports → Medical Records, Doctor's Notes → Clinical Notes) and clarified the distinction between the remaining categories.
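The rules from Steps 2 and 3 (3-7 folders, each category in exactly one folder, fallback category optionally unfiled) are easy to check mechanically before you rebuild anything in the product. This is a minimal sketch, assuming you keep the planned layout in a plain dictionary. The `validate_layout` function and its parameters are hypothetical and not part of Sky AI.

```python
from collections import Counter

# Folder layout copied from the "After" example above.
folders = {
    "Medical Documentation": ["Medical Records", "Clinical Notes", "Treatment Plans"],
    "Medical Assessments": ["IME Reports", "Diagnostic Results"],
    "Legal & Investigation": ["Police Reports", "Legal Correspondence", "Witness Statements"],
}


def validate_layout(folders, all_categories, allow_unfiled=(), min_folders=3, max_folders=7):
    """Return a list of human-readable problems; an empty list means the layout passes."""
    problems = []
    if not min_folders <= len(folders) <= max_folders:
        problems.append(
            f"{len(folders)} folders; the guideline is {min_folders}-{max_folders}"
        )
    # Count how many folders each category appears in.
    counts = Counter(cat for cats in folders.values() for cat in cats)
    for category in all_categories:
        seen = counts[category]
        if seen == 0 and category not in allow_unfiled:
            problems.append(f"'{category}' is not assigned to any folder")
        elif seen > 1:
            problems.append(f"'{category}' appears in {seen} folders")
    return problems


all_categories = [cat for cats in folders.values() for cat in cats] + ["Miscellaneous"]
print(validate_layout(folders, all_categories, allow_unfiled={"Miscellaneous"}))
# prints []
```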
Step 4: Enhance Category Definitions
With folders in place, refine your category definitions to work with the folder context.

For Each Category, Ensure You Have:
1. Clear Definition (1-3 sentences)
Good: "Medical records documenting a patient's health history, including diagnoses, treatments, medications, and clinical observations from healthcare providers."
Bad: "Medical stuff"
2. Specific Recognition Tips
Good: "Look for medical terminology, provider names, patient identifiers, vital signs, diagnosis codes (ICD-10), and clinical observations. Often includes letterhead from hospitals, clinics, or medical practices."
Bad: "Medical documents from doctors"
3. Expected Content
Good: "Patient demographics, chief complaint, history of present illness, past medical history, medications, allergies, physical examination findings, assessment and diagnosis, treatment plan."
Bad: "Medical information"
4. Key Topics
Good: "diagnosis, treatment, medications, vital signs, symptoms, medical history, physical examination, clinical assessment"
Bad: "health, medical"
5. Naming Convention (if applicable)
Good: "MedicalRecord_[PatientLastName]_[Date]_[ProviderName]"
Bad: "medical_doc"
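A structured naming convention like the "Good" example can also be checked against your own file names. The sketch below assumes details the guide does not specify: that the date is written as YYYY-MM-DD and that the patient and provider parts contain only letters. Adjust the hypothetical `NAME_PATTERN` to whatever convention your team actually uses.

```python
import re
from pathlib import Path

# Assumption: dates are YYYY-MM-DD and name parts contain only letters.
NAME_PATTERN = re.compile(
    r"^MedicalRecord_(?P<patient>[A-Za-z]+)_(?P<date>\d{4}-\d{2}-\d{2})_(?P<provider>[A-Za-z]+)$"
)


def parse_name(filename):
    """Return the parts of a conforming file name, or None if it does not match."""
    stem = Path(filename).stem  # drop the extension, e.g. ".pdf"
    match = NAME_PATTERN.match(stem)
    return match.groupdict() if match else None


print(parse_name("MedicalRecord_Smith_2024-03-15_CityClinic.pdf"))
# prints {'patient': 'Smith', 'date': '2024-03-15', 'provider': 'CityClinic'}
print(parse_name("medical_doc.pdf"))
# prints None
```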
Step 5: Test with Real Documents
Upload sample documents and evaluate categorization accuracy.

Action Items:
- Create a test case with 20-30 representative documents
- Upload documents and let the AI categorize them
- Review the results and note:
  - Correctly categorized documents
  - Incorrectly categorized documents
  - Documents assigned to the fallback category
- Identify patterns in the errors
Testing Checklist:
- Upload at least 3 documents for each major category
- Include edge cases (documents that could fit multiple categories)
- Include documents with poor quality scans or handwriting
- Include documents in different formats (typed, handwritten, forms)
- Review categorization within 24 hours of upload
Example Test Results:
Test Case: Personal Injury Claim #12345
Correctly Categorized (18/30):
- Medical Records: 5/5 correct
- Police Reports: 3/3 correct
- Treatment Plans: 4/5 correct (1 went to Medical Records)
- Invoices: 3/3 correct
- Legal Correspondence: 3/4 correct (1 went to Miscellaneous)
Incorrectly Categorized (12/30):
- IME Report → Medical Records (should be Medical Assessments)
- Witness Statement → Miscellaneous (should be Legal & Investigation)
- Clinical Notes → Treatment Plans (should be Medical Documentation)
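A tally like the one above can be produced from a simple list of (expected category, AI-assigned category) pairs recorded during your review. The following is a minimal sketch with made-up sample data. The `summarize` function is hypothetical, and how you collect the pairs (spreadsheet, export, manual notes) is up to you.

```python
from collections import Counter, defaultdict

# Hypothetical review notes: (category you expected, category the AI assigned).
results = [
    ("Medical Records", "Medical Records"),
    ("Treatment Plans", "Treatment Plans"),
    ("Treatment Plans", "Medical Records"),
    ("IME Reports", "Medical Records"),
    ("Witness Statements", "Miscellaneous"),
    ("Police Reports", "Police Reports"),
]


def summarize(results, fallback="Miscellaneous"):
    """Compute overall accuracy, fallback rate, per-category scores and error patterns."""
    if not results:
        raise ValueError("results must contain at least one document")
    per_category = defaultdict(lambda: [0, 0])  # expected -> [correct, total]
    errors = Counter()  # (expected, assigned) -> number of documents
    for expected, assigned in results:
        per_category[expected][1] += 1
        if expected == assigned:
            per_category[expected][0] += 1
        else:
            errors[(expected, assigned)] += 1
    total = len(results)
    correct = sum(score[0] for score in per_category.values())
    in_fallback = sum(1 for _, assigned in results if assigned == fallback)
    return {
        "accuracy_pct": 100 * correct / total,
        "fallback_pct": 100 * in_fallback / total,
        "per_category": {name: tuple(score) for name, score in per_category.items()},
        "errors": errors.most_common(),
    }


summary = summarize(results)
print(f"Accuracy: {summary['accuracy_pct']:.0f}%")
# prints Accuracy: 50%
print(summary["per_category"]["Treatment Plans"])
# prints (1, 2)
```

The `errors` entry groups mistakes by direction (for example, IME Reports assigned to Medical Records), which is the pattern information Step 6 relies on.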
Step 6: Refine Based on Results
Use your test results to improve category definitions and folder organization.
Common Issues and Solutions:
Issue 1: Documents Going to Wrong Category Within Same Folder
Example: Treatment Plans being categorized as Medical Records
Solution: Strengthen the distinction in definitions and recognition tips.
Before:
Medical Records: "Documents about patient health"
Treatment Plans: "Plans for patient care"
After:
Medical Records: "Historical documentation of past medical visits, diagnoses, and treatments already provided. Look for past-tense language and completed visit summaries."
Treatment Plans: "Forward-looking documents proposing future care, including recommended treatments, therapy schedules, and anticipated outcomes. Look for future-tense language and treatment recommendations."
Issue 2: Documents Going to Fallback/Miscellaneous
Example: Witness Statements not being recognized
Solution: Add more specific recognition tips and key topics.
Before:
Witness Statement: "Statements from witnesses"
Recognition Tips: "Witness accounts"
After:
Witness Statement: "First-hand accounts from individuals who observed an incident, typically used in legal proceedings or insurance investigations."
Recognition Tips: "Look for phrases like 'I witnessed', 'I observed', 'I saw', sworn statements, notarized signatures, witness contact information, and narrative descriptions of events. May include affidavits or statutory declarations."
Key Topics: "eyewitness, observation, incident description, sworn statement, affidavit, testimony, account of events"
Issue 3: Similar Documents Going to Different Categories
Example: Some medical assessments going to Medical Records, others to IME Reports
Solution: Consolidate categories or create clearer boundaries.
Option A - Consolidate:
Merge "IME Reports" into "Medical Assessments" and update the definition to include both independent exams and specialist assessments.
Option B - Clarify Boundaries:
IME Reports: "Independent medical examinations conducted by third-party physicians for legal or insurance purposes. Look for phrases like 'independent medical examination', 'IME', 'at the request of', and evaluator credentials."
Medical Assessments: "Specialist evaluations and diagnostic assessments from treating physicians. Look for detailed diagnostic findings, specialist opinions, and treatment recommendations from ongoing care providers."
Step 7: Implement and Monitor
Roll out your improved folder structure and monitor ongoing performance.
Action Items:
- Update all organization-level categories with improved definitions
- Communicate changes to your team
- Monitor categorization accuracy on new cases for 2-4 weeks
- Collect feedback from case managers
- Make iterative improvements
Monitoring Metrics:
Track these metrics to measure improvement:
- Categorization Accuracy: % of documents correctly categorized on first pass
- Fallback Rate: % of documents going to Miscellaneous/fallback category
- Manual Recategorization Rate: % of documents manually moved by users
- Category Usage: Which categories are used most/least frequently
- User Satisfaction: Feedback from team members on ease of finding documents
Example Monitoring Dashboard:
Week 1 (Before Folders):
- Categorization Accuracy: 68%
- Fallback Rate: 22%
- Manual Recategorization: 31%
Week 4 (After Folders + Refinement):
- Categorization Accuracy: 87%
- Fallback Rate: 8%
- Manual Recategorization: 12%
Improvement (in percentage points): +19 accuracy, -14 fallback rate, -19 manual recategorization
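The improvement figures in this dashboard are differences in percentage points (Week 4 value minus Week 1 value), not relative percentages. The short sketch below reproduces them from the dashboard numbers and also shows the relative change for comparison. The function names are illustrative.

```python
# Dashboard values from the example above (all in percent).
before = {"accuracy": 68, "fallback": 22, "manual_recategorization": 31}
after = {"accuracy": 87, "fallback": 8, "manual_recategorization": 12}


def point_change(before, after):
    """Difference in percentage points: after minus before."""
    return {metric: after[metric] - before[metric] for metric in before}


def relative_change_pct(before, after):
    """Change relative to the starting value, in percent, rounded to one decimal."""
    return {
        metric: round(100 * (after[metric] - before[metric]) / before[metric], 1)
        for metric in before
    }


print(point_change(before, after))
# prints {'accuracy': 19, 'fallback': -14, 'manual_recategorization': -19}
print(relative_change_pct(before, after))
# prints {'accuracy': 27.9, 'fallback': -63.6, 'manual_recategorization': -61.3}
```

When reporting results to your team, state which of the two you mean: a fallback rate that drops from 22% to 8% is a 14-point drop but a reduction of roughly 64% relative to where it started.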
Real-World Example: Insurance Claims Organization
Scenario
An insurance company processes accident benefit claims with hundreds of documents per case. They were experiencing:
- 35% of documents going to the wrong category
- 25% of documents going to "Miscellaneous"
- Case managers spending 2-3 hours per case manually reorganizing documents
Before: No Folder Structure
Categories (Flat List):
1. Medical Records
2. Reports
3. Forms
4. Letters
5. Bills
6. Photos
7. Documents
8. Other
9. Miscellaneous
Problems:
- "Reports" was too vague (medical reports? police reports? investigator reports?)
- "Letters" could be legal, medical, or administrative
- "Documents" was meaningless
- Definitions were generic and unhelpful to the AI
After: Folder Structure Implementation
Step 1: Audit and Regroup
They identified 6 major document domains in their workflow:
- Medical documentation
- Legal and liability
- Financial and billing
- Investigation and evidence
- Administrative and forms
- Communication
Step 2: Create Folder Structure
📁 Medical Documentation
- Medical Records (patient charts, visit summaries)
- Treatment Plans (proposed care, therapy schedules)
- IME Reports (independent medical examinations)
- Clinical Notes (doctor's observations, progress notes)
- Diagnostic Results (lab tests, imaging, x-rays)
- Prescriptions (medication orders, pharmacy records)
📁 Legal & Liability
- Police Reports (accident reports, collision statements)
- Legal Correspondence (attorney letters, legal notices)
- Witness Statements (eyewitness accounts, affidavits)
- Court Documents (filings, orders, judgments)
- Liability Assessments (fault determinations)
📁 Financial & Billing
- Medical Invoices (hospital bills, provider charges)
- Payment Records (EOBs, payment confirmations)
- Benefit Calculations (coverage determinations)
- Receipts (out-of-pocket expenses)
📁 Investigation & Evidence
- Accident Scene Photos (collision images, damage photos)
- Surveillance Reports (investigator observations)
- Vehicle Damage Assessments (repair estimates, appraisals)
- Incident Reports (employer reports, safety reports)
📁 Forms & Applications
- Claim Forms (initial applications, OCF forms)
- Authorization Forms (consent, release forms)
- Questionnaires (claimant statements, health history)
- Disability Certificates (functional assessments)
📁 Administrative
- Correspondence (general letters, notifications)
- Case Notes (internal notes, file memos)
- Miscellaneous (fallback category)
Step 3: Enhanced Definitions
Example for "Medical Records":
Definition: "Comprehensive documentation of a patient's medical history, including past and current diagnoses, treatments received, medications, allergies, and clinical observations from healthcare providers. These are historical records of care already provided, not future treatment plans."
Expected Content: "Patient demographics, medical history, chief complaints, physical examination findings, vital signs, diagnoses (ICD-10 codes), treatments administered, medications prescribed, allergies, immunization records, provider notes, visit summaries, discharge summaries."
Recognition Tips: "Look for medical terminology, provider letterhead (hospitals, clinics, physician offices), patient identifiers, vital signs (blood pressure, temperature, heart rate), diagnosis codes, medication lists, and clinical observations. Often includes headers like 'Medical History', 'Patient Chart', 'Clinical Summary', or 'Medical Report'. May contain multiple visits over time."
Key Topics: "medical history, diagnosis, treatment, medications, vital signs, clinical observations, patient care, healthcare provider, hospital records, physician notes, allergies, immunizations"
Naming Convention: "MedicalRecord_[PatientName]_[Date]_[Provider]"
Step 4: Testing Results
They tested with 50 documents from 5 recent cases:
Initial Test Results:
- Categorization Accuracy: 76% (38/50 correct)
- Fallback Rate: 14% (7/50 to Miscellaneous)
- Common Errors:
  - IME Reports → Medical Records (3 cases)
  - Legal Correspondence → Administrative (2 cases)
  - Treatment Plans → Medical Records (2 cases)
Step 5: Refinements
Based on errors, they:
- Strengthened IME Reports recognition tips to emphasize "independent" and "third-party"
- Added specific legal phrases to Legal Correspondence recognition tips
- Clarified Medical Records as "historical" vs Treatment Plans as "future-looking"
Step 6: Results After Refinement
After 4 weeks of monitoring:
Final Results:
- Categorization Accuracy: 91% (up from 65% before folders)
- Fallback Rate: 6% (down from 25%)
- Manual Recategorization: 9% (down from 35%)
- Time Savings: 1.5 hours per case (against a baseline of 2-3 hours spent on manual reorganization)
Annual Impact:
- 500 cases/year × 1.5 hours saved = 750 hours saved
- At $50/hour = $37,500 in labor savings
- Plus improved consistency and faster case processing
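The annual impact figure is plain arithmetic: cases per year times hours saved per case, times an hourly rate. The small sketch below reproduces the numbers from this example so you can substitute your own volumes and rates. The `annual_savings` function name is illustrative.

```python
def annual_savings(cases_per_year, hours_saved_per_case, hourly_rate):
    """Return (hours saved per year, labor cost saved per year)."""
    hours = cases_per_year * hours_saved_per_case
    return hours, hours * hourly_rate


hours, dollars = annual_savings(cases_per_year=500, hours_saved_per_case=1.5, hourly_rate=50)
print(f"{hours:,.0f} hours saved, ${dollars:,.0f} in labor savings")
# prints 750 hours saved, $37,500 in labor savings
```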
Advanced Strategies
Strategy 1: Use Folder Names in Recognition Tips
Reference the folder context in your category definitions to reinforce the AI's understanding.
Example:
Category: Clinical Notes
Folder: Medical Documentation
Recognition Tips: "Within medical documentation, clinical notes are the provider's real-time observations during patient encounters. Look for narrative descriptions of symptoms, physical examination findings, and clinical impressions. Unlike formal medical records or treatment plans, these are often less structured and more conversational in tone."
Strategy 2: Create Parallel Folder Structures for Different Case Types
If you handle multiple types of cases, create separate folder structures for each.
Example:
For Personal Injury Cases:
📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence
For Workers Compensation Cases:
📁 Medical Documentation
📁 Employment Records
📁 Workplace Safety
📁 Benefits & Claims
📁 Legal & Appeals
You can use the same category names in different folders, and the folder context helps the AI distinguish them.
Strategy 3: Use Folder Colors Consistently
Assign consistent colors to folders to create visual patterns that help both AI and human users.
Example:
📁 Medical Documentation (Blue) - All medical categories use shades of blue
📁 Legal & Liability (Red) - All legal categories use shades of red
📁 Financial & Billing (Green) - All financial categories use shades of green
Strategy 4: Create a "Staging" Folder for Ambiguous Categories
If you have categories that don't fit neatly into your main folders, create a temporary staging folder.
Example:
📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence
📁 Administrative
📁 Under Review (Staging)
- Correspondence (could be medical, legal, or administrative)
- General Reports (needs further classification)
- Miscellaneous (fallback)
Over time, you can split these ambiguous categories into more specific ones or merge them into existing folders.
Common Mistakes to Avoid
Mistake 1: Too Many Folders
Problem: Creating 15+ folders makes the system overwhelming and defeats the purpose of organization.
Solution: Stick to 3-7 main folders. If you need more granularity, use subcategories within folders, not more folders.
Mistake 2: Overlapping Folder Purposes
Problem: Creating folders like "Medical Documents" and "Healthcare Records" that mean essentially the same thing.
Solution: Ensure each folder has a distinct purpose. If two folders seem similar, merge them.
Mistake 3: Generic Category Definitions
Problem: Definitions like "Documents related to medical stuff" don't help the AI.
Solution: Write specific, detailed definitions with concrete examples and recognition tips.
Mistake 4: Not Testing Before Rollout
Problem: Implementing a new folder structure across all cases without testing.
Solution: Always test with sample documents first, refine based on results, then roll out gradually.
Mistake 5: Setting and Forgetting
Problem: Creating folders once and never reviewing or updating them.
Solution: Monitor categorization accuracy regularly and make iterative improvements based on real-world performance.
Mistake 6: Ignoring User Feedback
Problem: Not listening to case managers who manually recategorize documents.
Solution: Track which documents are frequently moved and why, then update definitions accordingly.
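If you can obtain a log of manual recategorizations, counting the most frequent moves shows which definitions to revisit first. This is a minimal sketch with invented data, assuming each log entry records the document, the category the AI assigned, and the category the user chose. The log format and the `frequent_moves` function are hypothetical.

```python
from collections import Counter

# Hypothetical move log: (document id, AI-assigned category, user-chosen category).
moves = [
    ("doc-101", "Medical Records", "IME Reports"),
    ("doc-102", "Medical Records", "IME Reports"),
    ("doc-103", "Miscellaneous", "Witness Statements"),
    ("doc-104", "Medical Records", "Treatment Plans"),
    ("doc-105", "Medical Records", "IME Reports"),
]


def frequent_moves(moves, min_count=2):
    """Return (assigned, corrected) pairs seen at least min_count times, most common first."""
    counts = Counter((assigned, corrected) for _, assigned, corrected in moves)
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]


print(frequent_moves(moves))
# prints [(('Medical Records', 'IME Reports'), 3)]
```

The counts tell you where documents are being moved. The reason for each move still has to come from the case managers themselves.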
Mistake 7: Over-Categorizing
Problem: Creating 50+ highly specific categories that confuse the AI and users.
Solution: Start with 10-20 broad categories and add more only when you see consistent patterns of documents that don't fit.
Measuring Success
Key Performance Indicators
Track these metrics to measure the impact of your folder organization:
1. Categorization Accuracy Rate
Formula: (Correctly Categorized Documents / Total Documents) × 100
Target: >85%
How to Measure: Manual review of sample cases
2. Fallback Category Rate
Formula: (Documents in Miscellaneous / Total Documents) × 100
Target: <10%
How to Measure: Query your system for fallback category usage
3. Manual Recategorization Rate
Formula: (Documents Manually Moved / Total Documents) × 100
Target: <15%
How to Measure: Track document category changes by users
4. Time to Organize Documents
Measure: Average time case managers spend organizing documents per case
Target: <30 minutes per case
How to Measure: User surveys or time tracking
5. User Satisfaction
Measure: Team feedback on ease of finding documents
Target: >4/5 rating
How to Measure: Regular surveys
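The three rate formulas share the same shape (part divided by total, times 100), so all five indicators can be checked against their targets in one pass. The sketch below uses invented counts for a 200-document review. The metric names, the `rate` and `check_targets` helpers, and the sample numbers are illustrative assumptions.

```python
import operator

# Targets from the list above: (comparison, threshold).
TARGETS = {
    "accuracy_pct": (operator.gt, 85),
    "fallback_pct": (operator.lt, 10),
    "manual_recategorization_pct": (operator.lt, 15),
    "minutes_per_case": (operator.lt, 30),
    "satisfaction_rating": (operator.gt, 4),
}


def rate(part, total):
    """(part / total) x 100, as used by the three rate formulas."""
    if total == 0:
        raise ValueError("total must be greater than zero")
    return 100 * part / total


def check_targets(measured):
    """Map each metric name to True if its target is met."""
    return {
        name: compare(measured[name], threshold)
        for name, (compare, threshold) in TARGETS.items()
    }


# Hypothetical review of 200 documents.
measured = {
    "accuracy_pct": rate(176, 200),
    "fallback_pct": rate(14, 200),
    "manual_recategorization_pct": rate(22, 200),
    "minutes_per_case": 40,
    "satisfaction_rating": 4.2,
}
print(check_targets(measured))
# prints {'accuracy_pct': True, 'fallback_pct': True, 'manual_recategorization_pct': True, 'minutes_per_case': False, 'satisfaction_rating': True}
```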
Before and After Comparison
Document your baseline before implementing folders, then measure again after 4-6 weeks:
Baseline (Before Folders):
- Categorization Accuracy: ____%
- Fallback Rate: ____%
- Manual Recategorization: ____%
- Time per Case: ____ hours
- User Satisfaction: ____/5
After Implementation (Week 6):
- Categorization Accuracy: ____%
- Fallback Rate: ____%
- Manual Recategorization: ____%
- Time per Case: ____ hours
- User Satisfaction: ____/5
Improvement:
- Accuracy: +____%
- Fallback: -____%
- Manual Work: -____%
- Time Saved: ____ hours
- Satisfaction: +____ points
Conclusion
Category folders are a powerful tool for improving document categorization in Sky AI. By following this step-by-step approach, you can:
- Reduce categorization errors by roughly 20-30 percentage points, as in the examples above
- Decrease fallback category usage by roughly 15-20 percentage points
- Save 1-2 hours per case in manual organization time
- Improve team efficiency and satisfaction
Key Takeaways:
- Start with an audit of your current categories
- Design 3-7 clear folders that match your workflow
- Write specific, detailed definitions for each category
- Test thoroughly before rolling out
- Monitor and refine based on real-world performance
- Avoid common mistakes like over-categorizing or generic definitions
Remember: The goal isn't perfection on day one. It's continuous improvement. Start with a solid foundation, test with real documents, and refine based on results. Over time, your categorization system will become more accurate and your team will spend less time manually organizing documents.
Next Steps
- Schedule a planning session with your team to audit current categories
- Design your folder structure using the guidelines in this article
- Create a test case with 20-30 representative documents
- Implement and test your new folder structure
- Monitor results for 4-6 weeks
- Refine and optimize based on performance data
- Share best practices with your team
For additional support, refer to the companion guide "How Folders and Categories Work in Sky AI" for a deeper understanding of the underlying system.