Data Extraction
RateScan uses advanced AI and OCR technology to automatically extract structured data from your rate confirmation documents. This guide explains what fields are extracted, how extraction works, and how to improve accuracy.
How Extraction Works
RateScan's extraction process combines multiple technologies to accurately identify and extract information:
Technology Stack
- Optical Character Recognition (OCR) - Converts images and PDFs into machine-readable text
- Natural Language Processing (NLP) - Understands document structure and context
- Machine Learning Models - Trained specifically on rate confirmation documents
- Pattern Recognition - Identifies common document layouts and field patterns
Extraction Pipeline
- Document Analysis - Analyzes document structure and layout
- Text Extraction - Extracts all text using OCR
- Field Identification - Identifies key fields using AI models
- Data Structuring - Organizes extracted data into structured format
- Validation - Validates and formats extracted data
- Presentation - Displays results in editable table format
Processing Time: Extraction typically takes 5-20 seconds depending on document complexity. Multi-page documents may take longer.
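As a rough illustration of how the stages listed above might chain together, here is a minimal sketch. It is not RateScan's implementation: the `Document` container, the stage functions, and the regex standing in for the AI models are all hypothetical, and the OCR step is replaced by text supplied directly.

```python
import re
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    """Hypothetical container handed from one stage to the next."""
    text: str  # stands in for the OCR output
    fields: dict[str, str] = field(default_factory=dict)
    completed: list[str] = field(default_factory=list)


def identify_fields(doc: Document) -> Document:
    # Toy stand-in for field identification: a labelled-value regex.
    match = re.search(r"Load\s*(?:Number|#)\s*:?\s*(\S+)", doc.text, re.IGNORECASE)
    if match:
        doc.fields["load_number"] = match.group(1)
    return doc


def validate(doc: Document) -> Document:
    # Drop empty values and trim surrounding whitespace.
    doc.fields = {k: v.strip() for k, v in doc.fields.items() if v.strip()}
    return doc


# Assumed stage order, mirroring the list above (OCR and presentation omitted).
STAGES: list[tuple[str, Callable[[Document], Document]]] = [
    ("field_identification", identify_fields),
    ("validation", validate),
]


def run_pipeline(doc: Document) -> Document:
    # Run each stage in order and record which ones completed.
    for name, stage in STAGES:
        doc = stage(doc)
        doc.completed.append(name)
    return doc


result = run_pipeline(Document(text="Load Number: AB1234\nRate: $1,200.00"))
print(result.fields)
```

Keeping each stage as a plain function makes it easy to see where a bad result entered the chain when reviewing extraction problems.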
Extracted Fields
RateScan extracts the following categories of information from rate confirmation documents:
Company and Header Information
- Company Name - The name of the company issuing the rate confirmation
- Company Logo - Detected when present (for display purposes)
- Header Information - Any header text or branding
Load Identification
- Load Number - Primary load identifier
- Truck Number - Vehicle or truck identifier
- Reference Number - Additional reference identifiers
- PO Number - Purchase order numbers when present
- Pro Number - Pro numbers for tracking
Shipper Information
- Shipper Company Name - Name of the shipping company
- Shipper Address - Complete street address
- Shipper City, State, ZIP - Location information
- Shipper Contact - Contact person or phone number
- Pickup Date - Scheduled pickup date
- Pickup Time - Pickup time or appointment window
- Pickup Instructions - Special pickup instructions or notes
Consignee Information
- Consignee Company Name - Name of the receiving company
- Consignee Address - Complete delivery address
- Consignee City, State, ZIP - Delivery location
- Consignee Contact - Contact person or phone number
- Delivery Date - Scheduled delivery date
- Delivery Time - Delivery time or appointment window
- Delivery Instructions - Special delivery instructions or notes
Load Items and Details
- Item Descriptions - Description of items being shipped
- Quantities - Number of items or units
- Weights - Weight per item and total weight
- Dimensions - Length, width, height when available
- Pallet Count - Number of pallets
- Piece Count - Number of pieces
- Destination - Final destination for multi-stop loads
Financial Information
- Rate - Base rate or line haul rate
- Total Amount - Total charge amount
- Accessorial Charges - Additional charges (fuel surcharge, detention, etc.)
- Payment Terms - Payment terms and conditions
- Currency - Currency type (typically USD)
Totals and Summaries
- Total Pickup Weight - Combined weight of all items
- Highest Amount - Highest individual charge amount
- Grand Total - Total of all charges
- Mileage - Distance when available
Additional Information
- Accessorials - Additional services or charges
- Special Instructions - Special handling instructions
- Notes - General notes or comments
- Temperature Requirements - Temperature-controlled shipping requirements
- Hazmat Information - Hazardous materials information when present
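If you work with the extracted data in your own tools, it can help to picture the field categories above as a nested record. The sketch below is one possible shape, not RateScan's actual export schema: the class names, attribute names, and types are assumptions chosen for illustration, and only a subset of the fields is shown.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Optional


@dataclass
class Stop:
    """Shipper or consignee details (hypothetical structure)."""
    company_name: str
    address: str
    city: str
    state: str
    zip_code: str
    contact: Optional[str] = None
    date: Optional[str] = None
    time: Optional[str] = None
    instructions: Optional[str] = None


@dataclass
class LineItem:
    description: str
    quantity: Optional[int] = None
    weight: Optional[Decimal] = None
    pallet_count: Optional[int] = None
    piece_count: Optional[int] = None


@dataclass
class RateConfirmation:
    load_number: str
    shipper: Stop
    consignee: Stop
    items: list[LineItem] = field(default_factory=list)
    rate: Optional[Decimal] = None
    total_amount: Optional[Decimal] = None
    currency: str = "USD"

    def total_pickup_weight(self) -> Decimal:
        # Sum the weights that were extracted; items without a weight are skipped.
        return sum((i.weight for i in self.items if i.weight is not None), Decimal("0"))
```

Optional fields default to `None` because, as noted throughout this guide, not every document contains every field.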
Extraction Accuracy
RateScan's extraction accuracy depends on several factors. Understanding these can help you achieve better results:
Factors Affecting Accuracy
Document Quality
- Image Resolution - Higher resolution (300+ DPI) improves accuracy
- Image Clarity - Sharp, clear images produce better results
- Contrast - Good contrast between text and background
- Orientation - Properly oriented documents extract better
Document Structure
- Standard Formats - Standard rate confirmation formats extract more accurately
- Clear Labels - Documents with clear field labels improve identification
- Consistent Layout - Consistent document layouts improve accuracy
- Handwriting - Handwritten text extracts less accurately than printed text
Field Complexity
- Simple Fields - Numbers, dates, and standard formats extract very accurately
- Addresses - Well-formatted addresses extract accurately
- Free Text - Notes and descriptions may require more review
- Unusual Formats - Non-standard formats may need manual correction
Typical Accuracy: For well-scanned documents, RateScan achieves 90-95% accuracy on structured fields like addresses, dates, and numbers. Free-form text fields may require more review.
Reviewing and Editing Extracted Data
All extracted data is presented in an editable table format. You can review and correct any fields as needed:
Review Interface
- Data Table - All extracted fields displayed in organized columns
- Inline Editing - Click any field to edit directly
- Auto-save - Changes save automatically as you edit
- Document Viewer - View original document alongside extracted data
- Field Highlighting - Some fields may highlight corresponding document areas
Common Corrections
OCR Errors
- Character Confusion - Common OCR errors include:
  - "0" (zero) vs "O" (letter O)
  - "1" (one) vs "I" (letter I) vs "l" (lowercase L)
  - "5" vs "S"
  - "8" vs "B"
- Spacing Issues - Extra or missing spaces in addresses or names
- Punctuation - Missing or incorrect punctuation
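If you post-process exported data yourself, the character confusions listed above can be corrected mechanically in fields that should contain only digits, such as ZIP codes. The following is a minimal sketch under that assumption; the function names are hypothetical, and this is not a description of what RateScan does internally. Do not apply it to fields that can legitimately contain letters, such as load numbers.

```python
from typing import Optional

# Letters that OCR commonly produces in place of digits (pairs from the list above).
LETTER_TO_DIGIT = str.maketrans({"O": "0", "I": "1", "l": "1", "S": "5", "B": "8"})


def fix_numeric_field(value: str) -> str:
    """Replace commonly confused letters with the digits they resemble."""
    return value.translate(LETTER_TO_DIGIT)


def clean_zip(value: str) -> Optional[str]:
    """Return a corrected 5-digit ZIP, or None if it still needs manual review."""
    fixed = fix_numeric_field(value.strip())
    return fixed if len(fixed) == 5 and fixed.isdigit() else None


print(clean_zip("9O2l0"))
```

Returning `None` rather than a best guess keeps questionable values visible for manual review.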
Format Corrections
- Date Formats - Ensure dates are in your preferred format
- Address Formatting - Verify address components are correctly separated
- Phone Numbers - Format phone numbers consistently
- Currency - Verify decimal places and currency symbols
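As one example of consistent formatting, phone numbers can be reduced to their digits and re-rendered in a single style. This sketch assumes 10-digit US numbers (optionally prefixed with the country code 1) and a `(555) 123-4567` display style; both are illustrative choices, and `format_us_phone` is a hypothetical helper rather than a RateScan feature.

```python
import re
from typing import Optional


def format_us_phone(raw: str) -> Optional[str]:
    """Render a US phone number as (AAA) BBB-CCCC, or None if it is not 10 digits."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading country code
    if len(digits) != 10:
        return None
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


print(format_us_phone("555.123.4567"))
```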
Missing Data
- Incomplete Fields - Some fields may be partially extracted
- Manual Entry - Add missing information manually
- Cross-Reference - Check original document for missing data
Best Practices for Review
- Verify Critical Fields First - Check load numbers, addresses, and amounts
- Compare with Original - Always compare extracted data with the original document
- Check Calculations - Verify totals and calculations
- Review Dates - Ensure dates are correct and in proper format
- Validate Addresses - Verify addresses are complete and properly formatted
- Check for Duplicates - Ensure information isn't duplicated across fields
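The "Check Calculations" step can be partly automated once the data is outside RateScan. The sketch below assumes the total should equal the base rate plus all accessorial charges, which may not hold for every rate confirmation (for example, when deductions apply); `totals_match` is a hypothetical helper.

```python
from decimal import Decimal


def totals_match(
    rate: Decimal,
    accessorials: list[Decimal],
    total: Decimal,
    tolerance: Decimal = Decimal("0.00"),
) -> bool:
    """Check that rate + accessorial charges equals the stated total."""
    expected = rate + sum(accessorials, Decimal("0"))
    return abs(expected - total) <= tolerance


ok = totals_match(
    Decimal("1500.00"),
    [Decimal("125.50"), Decimal("75.00")],
    Decimal("1700.50"),
)
print(ok)
```

`Decimal` is used instead of `float` so that cent-level comparisons are exact.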
Improving Extraction Accuracy
Document Preparation
- High-Quality Scans - Use 300 DPI or higher resolution
- Good Lighting - Ensure even lighting without shadows
- Flat Documents - Avoid wrinkles, folds, or curved surfaces
- Clean Documents - Remove staples, paper clips, or tape
- Proper Orientation - Ensure documents are right-side up
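To check whether an image meets the 300 DPI recommendation before uploading, you can estimate its effective resolution from its pixel dimensions. This sketch assumes the scan covers a full US Letter page (8.5 x 11 inches); for other paper sizes, pass different dimensions. The function names are illustrative.

```python
def effective_dpi(
    width_px: int,
    height_px: int,
    page_width_in: float = 8.5,
    page_height_in: float = 11.0,
) -> float:
    """Estimate DPI from pixel size, using the weaker of the two axes."""
    return min(width_px / page_width_in, height_px / page_height_in)


def meets_recommended_dpi(width_px: int, height_px: int, minimum: float = 300.0) -> bool:
    return effective_dpi(width_px, height_px) >= minimum


# A 2550 x 3300 pixel image of a Letter page is exactly 300 DPI.
print(effective_dpi(2550, 3300))
```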
Scanning Best Practices
- Use Document Scanner - Document scanners produce better results than cameras
- Consistent Settings - Use consistent scanning settings for similar documents
- Color vs Grayscale - Grayscale often works well for text documents
- Avoid Compression - Use minimal compression to preserve quality
Post-Processing
- Image Enhancement - Enhance contrast if needed before uploading
- Rotation Correction - Ensure documents are properly oriented
- Crop Unnecessary Areas - Remove borders or unnecessary white space
Important: If extraction accuracy is consistently poor, check your document quality first. Poor quality scans are the most common cause of extraction errors.
Field-Specific Notes
Addresses
- Addresses are extracted as structured data (street, city, state, ZIP)
- Company names are separated from addresses
- PO Box addresses are handled correctly
- Suite numbers and apartment numbers are included
Dates and Times
- Dates written in various formats are extracted and normalized to a consistent format
- Time formats are standardized
- Date ranges are handled when present
- Appointment windows are extracted when specified
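To illustrate what normalizing dates can look like, the sketch below tries a list of candidate formats and converts the first match to ISO 8601 (`YYYY-MM-DD`). The format list and the ISO target are assumptions for illustration, not RateScan's documented behavior. Month-first order is assumed for numeric dates, and month names are parsed according to the current locale.

```python
from datetime import datetime
from typing import Optional

# Candidate input formats, tried in order (an illustrative, incomplete list).
CANDIDATE_FORMATS = (
    "%m/%d/%Y",
    "%m-%d-%Y",
    "%Y-%m-%d",
    "%b %d, %Y",
    "%B %d, %Y",
    "%m/%d/%y",
)


def normalize_date(raw: str) -> Optional[str]:
    """Return the date as YYYY-MM-DD, or None if no candidate format matches."""
    cleaned = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # try the next format
    return None


print(normalize_date("03/15/2024"))
```

Values that match none of the formats return `None`, so they can be flagged for manual review instead of being guessed.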
Financial Data
- Currency symbols are detected and handled
- Decimal places are preserved
- Totals are calculated when individual items are present
- Accessorial charges are separated from base rates
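When re-checking amounts outside RateScan, a currency string can be converted to an exact decimal value while keeping its decimal places. This is a minimal sketch that assumes US-style formatting (comma as thousands separator, period as decimal point); `parse_amount` is a hypothetical helper.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_amount(raw: str) -> Optional[Decimal]:
    """Strip currency symbols and separators, keeping digits, '.', and '-'."""
    cleaned = re.sub(r"[^\d.\-]", "", raw)
    try:
        return Decimal(cleaned)
    except InvalidOperation:
        return None  # nothing numeric left, or a malformed number


print(parse_amount("$1,234.50"))
```

Because `Decimal` preserves trailing zeros, `"$1,234.50"` stays `1234.50` rather than becoming `1234.5`.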
Load Items
- Multiple line items are extracted separately
- Quantities and weights are associated with descriptions
- PO numbers are linked to line items when present
- Destinations are extracted for multi-stop loads
Troubleshooting Extraction Issues
No Data Extracted
- Check Document Quality - Ensure image is clear and readable
- Verify Format - Ensure document is a supported format
- Check Orientation - Ensure document is right-side up
- Try Re-uploading - Sometimes re-processing helps
Incorrect Field Extraction
- Edit Manually - Use inline editing to correct fields
- Check Original - Compare with original document
- Improve Quality - Re-scan with better quality if needed
- Report Patterns - If errors are consistent, note them for improvement
Missing Fields
- Check Document - Verify field exists in original document
- Field Location - Some fields may be in unexpected locations
- Manual Entry - Add missing fields manually
- Document Type - Some document types may not have all fields
Related Documentation
- Core → Uploading - Learn about uploading documents for best extraction results
- Getting Started → Welcome - Overview of RateScan features
- Core → Security - Security practices for extracted data