Get Started
Quick Start
  • Upload Input CSV
  • Upload Target CSV
  • Pick columns → Start
Use the sidebar on the left to add your files.
Data Requirements
  • CSV files with headers
  • Input: items to match
  • Target: reference list
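As a rough illustration of these requirements, the sketch below reads one named column from CSV text that has a header row. The column names (`description`, `food_name`), the sample rows, and the `read_column` helper are hypothetical and are not taken from the tool itself.

```python
import csv
import io

def read_column(csv_text, column):
    """Return the non-empty values of one named column from CSV text with a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header {reader.fieldnames}")
    # Skip rows where the chosen column is missing or blank.
    return [row[column].strip() for row in reader if row[column] and row[column].strip()]

# Hypothetical sample data; real files would be uploaded through the sidebar.
input_csv = "id,description\n1,2% milk\n2,OJ\n"
target_csv = 'code,food_name\nA1,"Milk, reduced fat, 2% milkfat"\nA2,"Orange juice, raw"\n'

input_items = read_column(input_csv, "description")   # items to match
target_items = read_column(target_csv, "food_name")   # reference list
print(input_items)
print(target_items)
```

Quoting fields that contain commas, as in the target rows above, keeps each food description in a single column.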

What This Tool Does

This application matches text descriptions between two datasets using AI-powered semantic analysis. Upload your input items and target reference list, select the columns to match, and the tool will find the best semantic matches based on meaning rather than exact text.

Key Features
  • Semantic matching using state-of-the-art embeddings
  • Adjustable similarity threshold for tuning match sensitivity
  • Interactive visualizations and data export
  • Text cleaning options for better matches
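To make the text cleaning feature concrete, here is a minimal sketch of the kind of preprocessing such an option might apply. The specific steps (lowercasing, replacing punctuation with spaces, collapsing whitespace) and the `clean_text` name are assumptions; the tool's actual cleaning rules are not described here.

```python
import re

def clean_text(text):
    """Lowercase, replace punctuation with spaces (keeping '%'), and collapse whitespace."""
    text = text.lower()
    # Assumption: only ASCII letters, digits, '%' and whitespace are kept.
    text = re.sub(r"[^a-z0-9%\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()

print(clean_text("  Bread, whole-wheat,  commercially prepared "))
```

Note that this simple character filter would also drop accented letters, so a real implementation would likely need Unicode-aware handling for international food names.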
Matching Setup

  • Similarity Threshold
  • Method: Semantic Embedding
  • Model
  • Note: Items below the threshold are marked as NO MATCH. Adjust the threshold for best performance on your dataset.
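The threshold rule can be sketched as follows. The similarity scores are invented for illustration, and treating a score equal to the threshold as a match (`>=`) is an assumption; `apply_threshold` is a hypothetical helper, not the tool's code.

```python
NO_MATCH = "NO MATCH"

def apply_threshold(scored_matches, threshold):
    """Relabel each (input_item, best_target, similarity) whose score falls below the threshold."""
    results = []
    for item, target, score in scored_matches:
        # Assumption: a score exactly at the threshold still counts as a match.
        label = target if score >= threshold else NO_MATCH
        results.append((item, label, score))
    return results

# Hypothetical scores, not output from the real model.
scored = [
    ("2% milk", "Milk, reduced fat, 2% milkfat", 0.91),
    ("OJ", "Orange juice, raw", 0.84),
    ("dragon fruit", "Apples, raw", 0.52),
]

for row in apply_threshold(scored, 0.85):
    print(row)
```

With these invented scores, lowering the threshold from 0.85 to 0.80 would accept the "OJ" match as well, which shows the trade-off between coverage and strictness.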

Tip: Return to Step 1: Data & Configure to adjust the threshold or column selections, then re-run the mapping.
  • Export All Data
  • Export Matches
Interactive Visualizations

About FoodMapper

FoodMapper

Advanced semantic matching tool for aligning food descriptions across nutritional databases


Overview

FoodMapper addresses a common problem in nutritional research: accurately matching food items between databases that use different naming conventions and descriptions. The tool uses neural text embeddings to find matches based on meaning rather than exact wording.

The Challenge

Nutritional databases often describe the same foods differently:

  • "2% milk" vs "Milk, reduced fat, 2% milkfat"
  • "OJ" vs "Orange juice, raw"
  • "Whole wheat bread" vs "Bread, whole-wheat, commercially prepared"

Traditional text matching fails to recognize these as the same items, leading to incomplete or inaccurate nutritional analyses.

Our Solution

FoodMapper uses semantic embeddings to understand the meaning behind food descriptions, enabling accurate matches even when the exact wording differs.
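A common way to compare embeddings is cosine similarity, and the sketch below picks the most similar target for one input. It is an illustration only: the three-dimensional vectors are toy stand-ins for real model embeddings, and the use of cosine similarity and the `best_match` helper are assumptions rather than a description of FoodMapper's internals.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_match(query_vec, target_vecs):
    """Return (target_name, similarity) for the target most similar to the query."""
    scored = ((name, cosine_similarity(query_vec, vec)) for name, vec in target_vecs.items())
    return max(scored, key=lambda pair: pair[1])

# Toy vectors; a real embedding model would produce much longer ones.
targets = {
    "Milk, reduced fat, 2% milkfat": [0.9, 0.1, 0.0],
    "Orange juice, raw": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedding of "2% milk"

name, score = best_match(query, targets)
print(name, round(score, 3))
```

Because the comparison is made on vectors rather than characters, two descriptions with little text in common can still score as close matches.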

  • AI Model: Powered by GTE-Large, a neural embedding model
  • Performance: Processes thousands of items per minute with a batch processing system
  • Accuracy: Semantic understanding; matches are based on meaning
  • Control: Adjustable thresholds to fine-tune match sensitivity

Key Features
  • Semantic Matching: Understands food descriptions using neural embeddings
  • Batch Processing: Handle thousands of items efficiently with concurrent processing
  • Interactive Visualizations: Explore match distributions and patterns with 8 chart types
  • Data Export: Download results as CSV with all original data preserved
  • Text Cleaning: Optional preprocessing that may improve match quality
  • Real-time Preview: See data transformations before processing
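The batch processing feature can be pictured with the sketch below, which splits items into fixed-size batches and processes them concurrently with a thread pool. The batch size, worker count, and the placeholder `process_batch` (which only measures text length instead of computing embeddings) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def make_batches(items, batch_size):
    """Split a list into consecutive chunks of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]

def process_batch(batch):
    """Placeholder for a per-batch embedding call; returns each text's length."""
    return [len(text) for text in batch]

def process_all(items, batch_size=2, max_workers=4):
    """Process batches concurrently and return the results in the original item order."""
    batches = make_batches(items, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the batches were submitted.
        results = pool.map(process_batch, batches)
        return [value for batch_result in results for value in batch_result]

items = ["2% milk", "OJ", "Whole wheat bread", "Apples, raw", "Rice"]
print(make_batches(items, 2))
print(process_all(items))
```

Keeping results in input order makes it straightforward to attach each match back to its original row when exporting.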
Use Cases
  • Harmonizing dietary intake data with nutrient databases
  • Linking research datasets to food composition tables
  • Standardizing food nomenclature across studies
  • Quality control for nutritional data entry
  • Cross-referencing international food databases

Development Team

Principal Investigator: Dr. Danielle G. Lemay
Research Molecular Biologist

Developer: Richard Stoker
IT Specialist (Scientific)

Organization:
USDA Agricultural Research Service
Western Human Nutrition Research Center
Davis, California


Version: 1.0.0

Contact: richard.stoker@usda.gov