How to Use Category Folders to Improve Document Categorization

Introduction

This guide provides practical, actionable steps for using category folders to improve how Sky AI categorizes your documents. While folders are optional, they can significantly enhance categorization accuracy when used strategically.


Who This Guide Is For: Organization administrators, case managers, and anyone responsible for setting up or maintaining document categories in Sky AI.


What You'll Learn:

  • How folder organization directly impacts AI categorization accuracy
  • Step-by-step strategies for structuring folders effectively
  • Real-world examples and before/after scenarios
  • How to test and validate your improvements
  • Common mistakes to avoid

Understanding How Folders Improve Categorization

The AI Categorization Process

When Sky AI processes a document, it goes through these steps:

  1. Content Extraction: The AI reads the document text and identifies key information
  2. Category Comparison: The AI compares the document against all available category definitions
  3. Best Match Selection: The AI assigns the document to the category with the strongest match
  4. Section Assignment: Individual sections within the document are assigned to categories

The AI uses several pieces of information from each category to make its decision:

  • Definition: What the category represents
  • Expected Content: What information should be in documents of this type
  • Recognition Tips: Specific phrases, formatting, or visual cues
  • Key Topics: Main subjects covered in this category
  • Naming Convention: How documents are typically named
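To make these fields concrete, here is a minimal Python sketch of a category record. The class name, field names, and sample definition are hypothetical and only mirror the list above; they are not Sky AI's actual data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Category:
    """Hypothetical record mirroring the fields listed above (not Sky AI's real schema)."""
    name: str
    definition: str = ""
    expected_content: str = ""
    recognition_tips: str = ""
    key_topics: List[str] = field(default_factory=list)
    naming_convention: Optional[str] = None  # optional, only if applicable
    folder: Optional[str] = None             # folders are optional

    def missing_fields(self) -> List[str]:
        """Return the descriptive fields that are still empty."""
        missing = []
        if not self.definition.strip():
            missing.append("definition")
        if not self.expected_content.strip():
            missing.append("expected_content")
        if not self.recognition_tips.strip():
            missing.append("recognition_tips")
        if not self.key_topics:
            missing.append("key_topics")
        return missing


# Illustrative category with only a definition filled in.
police_report = Category(
    name="Police Report",
    definition="Official reports filed by police about an accident or incident.",
    folder="Legal & Liability",
)
print(police_report.missing_fields())
```

A record like this makes it easy to see which categories give the AI little to work with: the more fields are empty, the fewer signals are available for matching.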

How Folders Help

Category folders improve categorization in three key ways:

1. Cognitive Organization for the AI

When categories are grouped into folders, the AI can use the folder context as an additional signal. For example, if a document contains medical terminology, the AI can prioritize categories within the "Medical Documents" folder over categories in the "Legal Documents" folder.


2. Reduced Ambiguity

Without folders, you might have categories like "Reports" in both medical and legal contexts. With folders, you can have:

  • Medical Documents → Medical Reports
  • Legal Documents → Legal Reports

This clear separation helps the AI distinguish between similar document types in different contexts.


3. Easier Human Review and Refinement

When categories are well-organized in folders, it's easier for you to:

  • Review categorization accuracy by domain
  • Identify missing categories within a specific area
  • Spot overlapping or redundant categories
  • Train new team members on the categorization system

Step-by-Step Strategy: Setting Up Folders for Better Categorization

Step 1: Audit Your Current Categories

Before creating folders, understand what you're working with.


Action Items:

  1. Export or list all your current organization-level categories
  2. Group them by theme on paper or in a spreadsheet
  3. Identify categories that are:
    • Similar in name but different in purpose
    • Overlapping in scope
    • Rarely used or never assigned
    • Missing clear definitions

Example Audit:

Current Categories (Unorganized):
- Medical Records
- Police Report
- Treatment Plan
- Legal Letter
- Invoice
- IME Report
- Correspondence
- Clinical Notes
- Witness Statement
- Payment Record
- Diagnostic Test
- Court Document

After Grouping:

Medical-Related:
- Medical Records
- Treatment Plan
- IME Report
- Clinical Notes
- Diagnostic Test

Legal-Related:
- Police Report
- Legal Letter
- Witness Statement
- Court Document
- Correspondence (ambiguous - could be medical or legal)

Financial-Related:
- Invoice
- Payment Record
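If you keep the audit in a spreadsheet or script, the grouping above can be sketched as follows. This is only an illustration: the theme labels are assigned by hand, and the variable names are made up for this example.

```python
from collections import defaultdict

# Candidate themes assigned by hand during the audit. A category with more
# than one candidate theme is ambiguous and needs a decision before folders
# are created.
candidate_themes = {
    "Medical Records": {"Medical"},
    "Police Report": {"Legal"},
    "Treatment Plan": {"Medical"},
    "Legal Letter": {"Legal"},
    "Invoice": {"Financial"},
    "IME Report": {"Medical"},
    "Correspondence": {"Medical", "Legal"},
    "Clinical Notes": {"Medical"},
    "Witness Statement": {"Legal"},
    "Payment Record": {"Financial"},
    "Diagnostic Test": {"Medical"},
    "Court Document": {"Legal"},
}

grouped = defaultdict(list)  # theme -> unambiguous categories
ambiguous = []               # categories that fit more than one theme

for category, themes in candidate_themes.items():
    if len(themes) == 1:
        grouped[next(iter(themes))].append(category)
    else:
        ambiguous.append(category)

for theme in sorted(grouped):
    print(f"{theme}: {', '.join(grouped[theme])}")
print("Needs a decision:", ", ".join(ambiguous))
```

Listing ambiguous categories separately keeps them from being filed by default; each one should be split, renamed, or deliberately assigned in Step 3.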

Step 2: Design Your Folder Structure

Create 3-7 main folders that represent the major domains in your work.


Guidelines:

  • Keep it simple: 3-7 folders is ideal. More than 10 becomes unwieldy
  • Use clear, distinct names: Avoid overlapping folder purposes
  • Think about your workflow: Organize by how your team thinks about documents
  • Consider your industry: Use terminology familiar to your team

Example Folder Structures by Industry:


Insurance Claims (Personal Injury):

📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence
📁 Administrative

Legal Practice (General):

📁 Pleadings & Motions
📁 Discovery Materials
📁 Evidence & Exhibits
📁 Correspondence
📁 Client Documents

Healthcare (Patient Records):

📁 Clinical Records
📁 Diagnostic & Lab Results
📁 Treatment & Care Plans
📁 Administrative & Billing
📁 External Reports

Step 3: Assign Categories to Folders

Move your categories into the appropriate folders.


Action Items:

  1. Start with your most-used categories first
  2. Assign each category to exactly one folder
  3. Rename categories if needed to avoid confusion within folders
  4. Leave the fallback/miscellaneous category in a general folder or unfiled
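Before applying a layout, it can help to check it against the rules above. The sketch below assumes the layout is written down as a plain mapping of folder name to category names; the function name and sample layout are hypothetical, and the check applies within a single folder structure.

```python
from collections import Counter
from typing import Dict, List


def check_folder_layout(folders: Dict[str, List[str]]) -> List[str]:
    """Return a list of human-readable problems found in a planned layout."""
    problems = []

    # Guideline from Step 2: aim for 3-7 main folders.
    if not 3 <= len(folders) <= 7:
        problems.append(f"{len(folders)} folders; the guide suggests 3-7")

    # Rule from Step 3: each category belongs to exactly one folder.
    counts = Counter(cat for cats in folders.values() for cat in cats)
    for cat, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"'{cat}' appears in {n} folders; assign it to exactly one")

    return problems


layout = {
    "Medical Documentation": ["Medical Records", "Clinical Notes", "Treatment Plans"],
    "Legal & Investigation": ["Police Reports", "Legal Correspondence", "Correspondence"],
    "Administrative": ["Correspondence", "Miscellaneous"],
}
for problem in check_folder_layout(layout):
    print(problem)
```

Here "Correspondence" is flagged because it was placed in two folders, which is the kind of ambiguity Step 3 asks you to resolve by renaming or choosing one home.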

Before and After Example:


Before (No Folders):

Categories:
- Medical Records
- Medical Reports
- Doctor's Notes
- Clinical Notes
- Treatment Records
- Treatment Plans
- Police Report
- Police Records
- Legal Correspondence
- Legal Letters

After (With Folders):

📁 Medical Documentation
   - Medical Records (patient history, charts)
   - Clinical Notes (doctor's observations)
   - Treatment Plans (proposed care)

📁 Medical Assessments
   - IME Reports (independent medical exams)
   - Diagnostic Results (lab tests, imaging)

📁 Legal & Investigation
   - Police Reports (accident reports, investigations)
   - Legal Correspondence (attorney letters)
   - Witness Statements

Notice: We consolidated similar categories (Medical Reports → Medical Records, Doctor's Notes → Clinical Notes) and clarified the distinction between remaining categories.


Step 4: Enhance Category Definitions

With folders in place, refine your category definitions to work with the folder context.


For Each Category, Ensure You Have:


1. Clear Definition (1-3 sentences)

Good: "Medical records documenting a patient's health history, including 
diagnoses, treatments, medications, and clinical observations from 
healthcare providers."

Bad: "Medical stuff"

2. Specific Recognition Tips

Good: "Look for medical terminology, provider names, patient identifiers, 
vital signs, diagnosis codes (ICD-10), and clinical observations. Often 
includes letterhead from hospitals, clinics, or medical practices."

Bad: "Medical documents from doctors"

3. Expected Content

Good: "Patient demographics, chief complaint, history of present illness, 
past medical history, medications, allergies, physical examination findings, 
assessment and diagnosis, treatment plan."

Bad: "Medical information"

4. Key Topics

Good: "diagnosis, treatment, medications, vital signs, symptoms, medical 
history, physical examination, clinical assessment"

Bad: "health, medical"

5. Naming Convention (if applicable)

Good: "MedicalRecord_[PatientLastName]_[Date]_[ProviderName]"

Bad: "medical_doc"
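A rough automated check can catch the most obviously vague definitions before you test. The sketch below is an assumption-laden heuristic, not a Sky AI feature: the 8-word minimum is an arbitrary threshold chosen for illustration, and sentence counting is a simple punctuation split.

```python
import re
from typing import List

MIN_WORDS = 8  # assumed threshold for "too short to be useful"; tune as needed


def lint_definition(text: str) -> List[str]:
    """Flag definitions that are very short or longer than 1-3 sentences."""
    issues = []
    words = text.split()
    # Naive sentence split on terminal punctuation; good enough for a sanity check.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]

    if len(words) < MIN_WORDS:
        issues.append(f"only {len(words)} words; likely too vague")
    if len(sentences) > 3:
        issues.append(f"{len(sentences)} sentences; the guide suggests 1-3")
    return issues


good = (
    "Medical records documenting a patient's health history, including "
    "diagnoses, treatments, medications, and clinical observations from "
    "healthcare providers."
)
bad = "Medical stuff"

print(lint_definition(good))  # no issues expected
print(lint_definition(bad))   # flagged as too short
```

A check like this only measures length, not quality, so treat a clean result as a starting point for the document testing in Step 5 rather than proof that a definition is good.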

Step 5: Test with Real Documents

Upload sample documents and evaluate categorization accuracy.


Action Items:

  1. Create a test case with 20-30 representative documents
  2. Upload documents and let the AI categorize them
  3. Review the results and note:
    • Correctly categorized documents
    • Incorrectly categorized documents
    • Documents assigned to the fallback category
  4. Identify patterns in the errors
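One way to identify error patterns is to record the expected and assigned category for each test document and tally the mismatches. The sketch below uses a small made-up result set and a fallback category named "Miscellaneous"; how you export results from your own system is outside its scope.

```python
from collections import Counter

# (expected category, category assigned by the AI) for each test document.
# Illustrative data only.
results = (
    [("Medical Records", "Medical Records")] * 5
    + [("Treatment Plans", "Treatment Plans")] * 4
    + [("Treatment Plans", "Medical Records")]
    + [("Legal Correspondence", "Legal Correspondence")] * 3
    + [("Legal Correspondence", "Miscellaneous")]
)

correct = sum(1 for expected, assigned in results if expected == assigned)
accuracy = 100.0 * correct / len(results)

# Count each (expected -> assigned) mismatch so repeated confusions stand out.
confusions = Counter((e, a) for e, a in results if e != a)
fallback = sum(1 for _, assigned in results if assigned == "Miscellaneous")

print(f"Accuracy: {correct}/{len(results)} ({accuracy:.0f}%)")
print(f"Sent to fallback: {fallback}")
for (expected, assigned), n in confusions.most_common():
    print(f"{expected} -> {assigned}: {n}")
```

Pairs that recur in the tally point to the two definitions that need a sharper boundary, which is what Step 6 addresses.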

Testing Checklist:

✓ Upload at least 3 documents for each major category
✓ Include edge cases (documents that could fit multiple categories)
✓ Include documents with poor quality scans or handwriting
✓ Include documents in different formats (typed, handwritten, forms)
✓ Review categorization within 24 hours of upload

Example Test Results:

Test Case: Personal Injury Claim #12345

Correctly Categorized (18/30):
✓ Medical Records: 5/5 correct
✓ Police Reports: 3/3 correct
✓ Treatment Plans: 4/5 correct (1 went to Medical Records)
✓ Invoices: 3/3 correct
✓ Legal Correspondence: 3/4 correct (1 went to Miscellaneous)

Incorrectly Categorized (12/30):
✗ IME Report → Medical Records (should be Medical Assessments)
✗ Witness Statement → Miscellaneous (should be Legal & Investigation)
✗ Clinical Notes → Treatment Plans (should be Medical Documentation)

Step 6: Refine Based on Results

Use your test results to improve category definitions and folder organization.


Common Issues and Solutions:


Issue 1: Documents Going to Wrong Category Within Same Folder


Example: Treatment Plans being categorized as Medical Records


Solution: Strengthen the distinction in definitions and recognition tips.

Before:
Medical Records: "Documents about patient health"
Treatment Plans: "Plans for patient care"

After:
Medical Records: "Historical documentation of past medical visits, 
diagnoses, and treatments already provided. Look for past-tense language 
and completed visit summaries."

Treatment Plans: "Forward-looking documents proposing future care, 
including recommended treatments, therapy schedules, and anticipated 
outcomes. Look for future-tense language and treatment recommendations."

Issue 2: Documents Going to Fallback/Miscellaneous


Example: Witness Statements not being recognized


Solution: Add more specific recognition tips and key topics.

Before:
Witness Statement: "Statements from witnesses"
Recognition Tips: "Witness accounts"

After:
Witness Statement: "First-hand accounts from individuals who observed 
an incident, typically used in legal proceedings or insurance investigations."
Recognition Tips: "Look for phrases like 'I witnessed', 'I observed', 
'I saw', sworn statements, notarized signatures, witness contact information, 
and narrative descriptions of events. May include affidavits or statutory 
declarations."
Key Topics: "eyewitness, observation, incident description, sworn statement, 
affidavit, testimony, account of events"

Issue 3: Similar Documents Going to Different Categories


Example: Some medical assessments going to Medical Records, others to IME Reports


Solution: Consolidate categories or create clearer boundaries.

Option A - Consolidate:
Merge "IME Reports" into "Medical Assessments" and update definition to 
include both independent exams and specialist assessments.

Option B - Clarify Boundaries:
IME Reports: "Independent medical examinations conducted by third-party 
physicians for legal or insurance purposes. Look for phrases like 
'independent medical examination', 'IME', 'at the request of', and 
evaluator credentials."

Medical Assessments: "Specialist evaluations and diagnostic assessments 
from treating physicians. Look for detailed diagnostic findings, specialist 
opinions, and treatment recommendations from ongoing care providers."

Step 7: Implement and Monitor

Roll out your improved folder structure and monitor ongoing performance.


Action Items:

  1. Update all organization-level categories with improved definitions
  2. Communicate changes to your team
  3. Monitor categorization accuracy on new cases for 2-4 weeks
  4. Collect feedback from case managers
  5. Make iterative improvements

Monitoring Metrics:

Track these metrics to measure improvement:

  • Categorization Accuracy: % of documents correctly categorized on first pass
  • Fallback Rate: % of documents going to Miscellaneous/fallback category
  • Manual Recategorization Rate: % of documents manually moved by users
  • Category Usage: Which categories are used most/least frequently
  • User Satisfaction: Feedback from team members on ease of finding documents

Example Monitoring Dashboard:

Week 1 (Before Folders):
- Categorization Accuracy: 68%
- Fallback Rate: 22%
- Manual Recategorization: 31%

Week 4 (After Folders + Refinement):
- Categorization Accuracy: 87%
- Fallback Rate: 8%
- Manual Recategorization: 12%

Improvement: +19 percentage points accuracy, -14 points fallback rate, -19 points manual recategorization
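When reporting these changes, it is worth separating the change in percentage points from the relative change, since the two are easily confused. The following sketch recomputes both from the dashboard figures above; the dictionary keys are just illustrative labels.

```python
# Dashboard figures from the example above, in percent.
before = {"accuracy": 68, "fallback": 22, "manual_recategorization": 31}
after = {"accuracy": 87, "fallback": 8, "manual_recategorization": 12}

# Change in percentage points: a simple difference of the two rates.
delta = {k: after[k] - before[k] for k in before}

# Relative change: the difference expressed as a share of the starting value.
relative = {k: round(100 * (after[k] - before[k]) / before[k], 1) for k in before}

for k in before:
    print(f"{k}: {delta[k]:+d} points ({relative[k]:+.1f}% relative)")
```

For example, the fallback rate falls by 14 percentage points, which is a relative reduction of roughly 64% from its starting value of 22%.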

Real-World Example: Insurance Claims Organization

Scenario

An insurance company processes accident benefit claims with hundreds of documents per case. They were experiencing:

  • 35% of documents going to the wrong category
  • 25% of documents going to "Miscellaneous"
  • Case managers spending 2-3 hours per case manually reorganizing documents

Before: No Folder Structure

Categories (Flat List):
1. Medical Records
2. Reports
3. Forms
4. Letters
5. Bills
6. Photos
7. Documents
8. Other
9. Miscellaneous

Problems:

  • "Reports" was too vague (medical reports? police reports? investigator reports?)
  • "Letters" could be legal, medical, or administrative
  • "Documents" was meaningless
  • Definitions were generic and unhelpful to the AI

After: Folder Structure Implementation


Step 1: Audit and Regroup

They identified 6 major document domains in their workflow:

  1. Medical documentation
  2. Legal and liability
  3. Financial and billing
  4. Investigation and evidence
  5. Administrative and forms
  6. Communication

Step 2: Create Folder Structure

📁 Medical Documentation
   - Medical Records (patient charts, visit summaries)
   - Treatment Plans (proposed care, therapy schedules)
   - IME Reports (independent medical examinations)
   - Clinical Notes (doctor's observations, progress notes)
   - Diagnostic Results (lab tests, imaging, x-rays)
   - Prescriptions (medication orders, pharmacy records)

📁 Legal & Liability
   - Police Reports (accident reports, collision statements)
   - Legal Correspondence (attorney letters, legal notices)
   - Witness Statements (eyewitness accounts, affidavits)
   - Court Documents (filings, orders, judgments)
   - Liability Assessments (fault determinations)

📁 Financial & Billing
   - Medical Invoices (hospital bills, provider charges)
   - Payment Records (EOBs, payment confirmations)
   - Benefit Calculations (coverage determinations)
   - Receipts (out-of-pocket expenses)

📁 Investigation & Evidence
   - Accident Scene Photos (collision images, damage photos)
   - Surveillance Reports (investigator observations)
   - Vehicle Damage Assessments (repair estimates, appraisals)
   - Incident Reports (employer reports, safety reports)

📁 Forms & Applications
   - Claim Forms (initial applications, OCF forms)
   - Authorization Forms (consent, release forms)
   - Questionnaires (claimant statements, health history)
   - Disability Certificates (functional assessments)

📁 Administrative
   - Correspondence (general letters, notifications)
   - Case Notes (internal notes, file memos)
   - Miscellaneous (fallback category)

Step 3: Enhanced Definitions

Example for "Medical Records":

Definition: "Comprehensive documentation of a patient's medical history, 
including past and current diagnoses, treatments received, medications, 
allergies, and clinical observations from healthcare providers. These are 
historical records of care already provided, not future treatment plans."

Expected Content: "Patient demographics, medical history, chief complaints, 
physical examination findings, vital signs, diagnoses (ICD-10 codes), 
treatments administered, medications prescribed, allergies, immunization 
records, provider notes, visit summaries, discharge summaries."

Recognition Tips: "Look for medical terminology, provider letterhead 
(hospitals, clinics, physician offices), patient identifiers, vital signs 
(blood pressure, temperature, heart rate), diagnosis codes, medication lists, 
and clinical observations. Often includes headers like 'Medical History', 
'Patient Chart', 'Clinical Summary', or 'Medical Report'. May contain 
multiple visits over time."

Key Topics: "medical history, diagnosis, treatment, medications, vital signs, 
clinical observations, patient care, healthcare provider, hospital records, 
physician notes, allergies, immunizations"

Naming Convention: "MedicalRecord_[PatientName]_[Date]_[Provider]"

Step 4: Testing Results

They tested with 50 documents from 5 recent cases:

Initial Test Results:
- Categorization Accuracy: 76% (38/50 correct)
- Fallback Rate: 14% (7/50 to Miscellaneous)
- Common Errors:
  * IME Reports → Medical Records (3 cases)
  * Legal Correspondence → Administrative (2 cases)
  * Treatment Plans → Medical Records (2 cases)

Step 5: Refinements

Based on errors, they:

  1. Strengthened IME Reports recognition tips to emphasize "independent" and "third-party"
  2. Added specific legal phrases to Legal Correspondence recognition tips
  3. Clarified Medical Records as "historical" vs Treatment Plans as "future-looking"

Step 6: Results After Refinement

After 4 weeks of monitoring:

Final Results:
- Categorization Accuracy: 91% (up from 65% before folders)
- Fallback Rate: 6% (down from 25%)
- Manual Recategorization: 9% (down from 35%)
- Time Savings: 1.5 hours saved per case (against a baseline of 2-3 hours per case)

Annual Impact:
- 500 cases/year × 1.5 hours saved = 750 hours saved
- At $50/hour = $37,500 in labor savings
- Plus improved consistency and faster case processing

Advanced Strategies

Strategy 1: Use Folder Names in Recognition Tips

Reference the folder context in your category definitions to reinforce the AI's understanding.


Example:

Category: Clinical Notes
Folder: Medical Documentation

Recognition Tips: "Within medical documentation, clinical notes are the 
provider's real-time observations during patient encounters. Look for 
narrative descriptions of symptoms, physical examination findings, and 
clinical impressions. Unlike formal medical records or treatment plans, 
these are often less structured and more conversational in tone."

Strategy 2: Create Parallel Folder Structures for Different Case Types

If you handle multiple types of cases, create separate folder structures for each.


Example:

For Personal Injury Cases:
📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence

For Workers Compensation Cases:
📁 Medical Documentation
📁 Employment Records
📁 Workplace Safety
📁 Benefits & Claims
📁 Legal & Appeals

You can use the same category names in different folders, and the folder context helps the AI distinguish them.


Strategy 3: Use Folder Colors Consistently

Assign consistent colors to folders to create visual patterns that help both AI and human users.


Example:

📁 Medical Documentation (Blue)
   - All medical categories use shades of blue

📁 Legal & Liability (Red)
   - All legal categories use shades of red

📁 Financial & Billing (Green)
   - All financial categories use shades of green

Strategy 4: Create a "Staging" Folder for Ambiguous Categories

If you have categories that don't fit neatly into your main folders, create a temporary staging folder.


Example:

📁 Medical Documentation
📁 Legal & Liability
📁 Financial & Billing
📁 Investigation & Evidence
📁 Administrative
📁 Under Review (Staging)
   - Correspondence (could be medical, legal, or administrative)
   - General Reports (needs further classification)
   - Miscellaneous (fallback)

Over time, you can split these ambiguous categories into more specific ones or merge them into existing folders.


Common Mistakes to Avoid

Mistake 1: Too Many Folders

Problem: Creating 15+ folders makes the system overwhelming and defeats the purpose of organization.


Solution: Stick to 3-7 main folders. If you need more granularity, use subcategories within folders, not more folders.


Mistake 2: Overlapping Folder Purposes

Problem: Creating folders like "Medical Documents" and "Healthcare Records" that mean essentially the same thing.


Solution: Ensure each folder has a distinct purpose. If two folders seem similar, merge them.


Mistake 3: Generic Category Definitions

Problem: Definitions like "Documents related to medical stuff" don't help the AI.


Solution: Write specific, detailed definitions with concrete examples and recognition tips.


Mistake 4: Not Testing Before Rollout

Problem: Implementing a new folder structure across all cases without testing.


Solution: Always test with sample documents first, refine based on results, then roll out gradually.


Mistake 5: Setting and Forgetting

Problem: Creating folders once and never reviewing or updating them.


Solution: Monitor categorization accuracy regularly and make iterative improvements based on real-world performance.


Mistake 6: Ignoring User Feedback

Problem: Not listening to case managers who manually recategorize documents.


Solution: Track which documents are frequently moved and why, then update definitions accordingly.


Mistake 7: Over-Categorizing

Problem: Creating 50+ highly specific categories that confuse the AI and users.


Solution: Start with 10-20 broad categories and add more only when you see consistent patterns of documents that don't fit.


Measuring Success

Key Performance Indicators

Track these metrics to measure the impact of your folder organization:


1. Categorization Accuracy Rate

Formula: (Correctly Categorized Documents / Total Documents) × 100
Target: >85%
How to Measure: Manual review of sample cases

2. Fallback Category Rate

Formula: (Documents in Miscellaneous / Total Documents) × 100
Target: <10%
How to Measure: Query your system for fallback category usage

3. Manual Recategorization Rate

Formula: (Documents Manually Moved / Total Documents) × 100
Target: <15%
How to Measure: Track document category changes by users

4. Time to Organize Documents

Measure: Average time case managers spend organizing documents per case
Target: <30 minutes per case
How to Measure: User surveys or time tracking

5. User Satisfaction

Measure: Team feedback on ease of finding documents
Target: >4/5 rating
How to Measure: Regular surveys
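The three rate formulas above can be computed and compared against their targets with a few lines of Python. In this sketch the counts of 38 correct and 7 fallback documents out of 50 come from the initial test in the real-world example; the count of 12 manually moved documents is an assumption made for illustration (it simply treats every miscategorized document as one that was moved).

```python
def rate(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


total_documents = 50
kpis = {
    "accuracy": rate(38, total_documents),                 # correctly categorized
    "fallback": rate(7, total_documents),                  # sent to Miscellaneous
    "manual_recategorization": rate(12, total_documents),  # assumed count
}

# Targets from the KPI list: >85% accuracy, <10% fallback, <15% manual moves.
targets_met = {
    "accuracy": kpis["accuracy"] > 85,
    "fallback": kpis["fallback"] < 10,
    "manual_recategorization": kpis["manual_recategorization"] < 15,
}

for name, value in kpis.items():
    status = "meets target" if targets_met[name] else "below target"
    print(f"{name}: {value:.0f}% ({status})")
```

With these inputs all three metrics miss their targets, which is what you would expect from a first test before any refinement.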

Before and After Comparison

Document your baseline before implementing folders, then measure again after 4-6 weeks:

Baseline (Before Folders):
- Categorization Accuracy: ____%
- Fallback Rate: ____%
- Manual Recategorization: ____%
- Time per Case: ____ hours
- User Satisfaction: ____/5

After Implementation (Week 6):
- Categorization Accuracy: ____%
- Fallback Rate: ____%
- Manual Recategorization: ____%
- Time per Case: ____ hours
- User Satisfaction: ____/5

Improvement:
- Accuracy: +____%
- Fallback: -____%
- Manual Work: -____%
- Time Saved: ____ hours
- Satisfaction: +____ points

Conclusion

Category folders are a powerful tool for improving document categorization in Sky AI. By following this step-by-step approach, you can:

  1. Reduce categorization errors by 20-30 percentage points
  2. Decrease fallback category usage by 15-20 percentage points
  3. Save 1-2 hours per case in manual organization time
  4. Improve team efficiency and satisfaction

Key Takeaways:

  • Start with an audit of your current categories
  • Design 3-7 clear folders that match your workflow
  • Write specific, detailed definitions for each category
  • Test thoroughly before rolling out
  • Monitor and refine based on real-world performance
  • Avoid common mistakes like over-categorizing or generic definitions

Remember: The goal isn't perfection on day one. It's continuous improvement. Start with a solid foundation, test with real documents, and refine based on results. Over time, your categorization system will become more accurate and your team will spend less time manually organizing documents.


Next Steps

  1. Schedule a planning session with your team to audit current categories
  2. Design your folder structure using the guidelines in this article
  3. Create a test case with 20-30 representative documents
  4. Implement and test your new folder structure
  5. Monitor results for 4-6 weeks
  6. Refine and optimize based on performance data
  7. Share best practices with your team

For additional support, refer to the companion guide "How Folders and Categories Work in Sky AI" for a deeper understanding of the underlying system.
