How 567,090 iMessages, 139,000 photos, 1,002 Google Voice records, and 130 million tokens become one intelligent assistant. A visualization of the raw scale of data that ARIA ingests, analyzes, and distills into personal intelligence.
Eight headline statistics that capture the scale of what ARIA processes. Every number represents real data from the production database — no projections, no estimates.
The journey from raw sensor readings and photo metadata to personally relevant intelligence. Each layer refines, filters, and distills — turning hundreds of thousands of data points into a handful of genuinely useful insights.
Everything flows in. iMessage history spanning years of conversations, photos from the camera roll, Google Voice archives, sensor data from iOS, calendar events, contacts, health readings, music libraries, HomeKit devices — a continuous stream of raw life data captured by the platform.
Raw data is processed by AI models — 567,090 iMessages are analyzed in batches of 50 across 10,707 jobs, extracting facts about people and relationships. 412 Google Voice files (353 text conversations + 59 voicemails) are transcribed and analyzed. Photos are described by vision models, knowledge entities are extracted and linked, and context changes are detected across 15 sources every 5 minutes.
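The batching step can be sketched as follows. This is a minimal illustration, not ARIA's actual job code: the helper names `batched` and `batch_count` are hypothetical, and only the batch size of 50 comes from the text above.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

BATCH_SIZE = 50  # batch size stated in the text


def batched(items: Sequence[T], size: int = BATCH_SIZE) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of at most `size` items (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def batch_count(total: int, size: int = BATCH_SIZE) -> int:
    """Number of batches needed to cover `total` items (ceiling division)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return -(-total // size)


if __name__ == "__main__":
    # A strict 50-per-job split of the full message count.
    print(batch_count(567_090))
```

A strict split of 567,090 messages into batches of 50 would need 11,342 batches. The 10,707 jobs reported above is lower, so some messages were presumably filtered or grouped differently; the source does not say how.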
Analysis output is refined into structured knowledge. Memory records are deduplicated (35% superseded), owner profile facts are organized across 9 domains, and knowledge graph entities are linked with typed relationships. The noise is stripped away; only verified intelligence remains.
The final layer. Hundreds of thousands of data points reduced to carefully selected, deeply analyzed, personally relevant intelligence. What actually reaches the user is the tip of a massive data iceberg — each insight backed by the full weight of everything below it.
202,751 total memory records distilled from every conversation, analysis run, and background job. After deduplication and supersession, 131,897 remain active: a 35% reduction that keeps only the freshest, most relevant knowledge.
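The 35% figure follows directly from the two record counts. A quick arithmetic check, using only the numbers quoted above (the variable names are illustrative):

```python
total_records = 202_751   # all memory records ever written
active_records = 131_897  # records still active after dedup and supersession

superseded = total_records - active_records
superseded_share = superseded / total_records

print(f"{superseded:,} records superseded ({superseded_share:.1%} of total)")
```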
Every conversation with ARIA can produce memory updates. Claude analyzes the dialogue and extracts durable facts — preferences ("prefers morning workouts"), people ("brother lives in Austin"), context ("working on product roadmap"), and recurring patterns ("reads every evening"). Background jobs also generate memories from iMessage analysis, photo descriptions, and health data.
When ARIA learns something that contradicts or updates an existing memory, she doesn't delete the old one; she supersedes it. The old record stays for audit trail purposes, marked with a pointer to its replacement. This versioning means ARIA can explain why she changed her understanding of a fact.
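One way to picture supersession is an append-only store where each record carries an optional pointer to its replacement. The sketch below is an assumption about the shape of such a store, not ARIA's schema: `MemoryRecord`, `MemoryStore`, and the `superseded_by` field are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class MemoryRecord:
    id: int
    text: str
    superseded_by: Optional[int] = None  # hypothetical pointer to the replacement


class MemoryStore:
    """Append-only store: records are never deleted, only superseded."""

    def __init__(self) -> None:
        self._records: Dict[int, MemoryRecord] = {}
        self._next_id = 1

    def add(self, text: str) -> MemoryRecord:
        record = MemoryRecord(id=self._next_id, text=text)
        self._records[record.id] = record
        self._next_id += 1
        return record

    def supersede(self, old_id: int, new_text: str) -> MemoryRecord:
        old = self._records[old_id]
        if old.superseded_by is not None:
            raise ValueError(f"record {old_id} is already superseded")
        new = self.add(new_text)
        old.superseded_by = new.id  # keep the old record, point it at the new one
        return new

    def active(self) -> List[MemoryRecord]:
        return [r for r in self._records.values() if r.superseded_by is None]

    def history(self, record_id: int) -> List[MemoryRecord]:
        """Follow replacement pointers from a record to its latest version."""
        chain: List[MemoryRecord] = []
        current: Optional[MemoryRecord] = self._records[record_id]
        while current is not None:
            chain.append(current)
            nxt = current.superseded_by
            current = self._records[nxt] if nxt is not None else None
        return chain
```

Walking `history()` from an old record yields the chain of versions, which is the kind of trail that lets an assistant explain how its understanding of a fact changed.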
131,016 structured facts organized across 9 domains in the owner profile. Each fact has a confidence score, evidence chain, source attribution, and temporal validity. This is ARIA's deep understanding of the user's life.
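A fact with those four attributes might be modeled roughly like this. The class and field names are hypothetical, the confidence range of 0 to 1 is an assumption, and the nine domains are not named in the text, so `domain` is left as a free-form string.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class ProfileFact:
    domain: str        # one of the 9 profile domains (names not given in the source)
    statement: str     # the fact itself
    confidence: float  # assumed to be a score in [0.0, 1.0]
    source: str        # source attribution, e.g. which pipeline produced it
    evidence: List[str] = field(default_factory=list)  # evidence chain
    valid_from: Optional[date] = None                   # temporal validity, open-ended if None
    valid_until: Optional[date] = None

    def __post_init__(self) -> None:
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    def is_valid_on(self, day: date) -> bool:
        """True if `day` falls inside the fact's validity window."""
        if self.valid_from is not None and day < self.valid_from:
            return False
        if self.valid_until is not None and day > self.valid_until:
            return False
        return True
```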
139,000 photos in the iOS library. Each one passes through a multi-stage pipeline: metadata sync, AI description, entity extraction, and knowledge graph integration. The result is a searchable, queryable visual memory.
Total photos in iOS library
Synced to ARIA (46% complete)
Vision model analysis complete
Facts extracted into KG
90,224 calls across three AI providers. The majority runs on Google's free tier for classification and vision, with Anthropic Claude handling deep reasoning and OpenAI powering embeddings.
To put that in perspective: 130 million tokens is approximately 100 million words, equivalent to roughly 330 novels' worth of text processed by AI models.
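The conversion can be checked with a few lines of arithmetic. The ratio of 0.75 words per token is a common rule of thumb for English text and is an assumption here; the source does not state which ratio or novel length it used.

```python
TOKENS = 130_000_000     # from the text
WORDS_PER_TOKEN = 0.75   # assumption: rough rule of thumb for English
NOVELS = 330             # from the text

words = TOKENS * WORDS_PER_TOKEN
implied_words_per_novel = words / NOVELS

print(f"about {words:,.0f} words")
print(f"about {implied_words_per_novel:,.0f} words per novel implied")
```

Under that assumption the total comes to just under 100 million words, and the 330-novel comparison implies books of roughly 300,000 words each. With shorter novels the equivalent count would be proportionally higher.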
Four composite examples showing the end-to-end data journey. Each demonstrates how disparate raw signals are combined, analyzed, and distilled into something genuinely useful.
A photo taken at a restaurant with a friend is synced to ARIA. The AI vision model describes the scene — a candlelit dinner, two people, downtown restaurant visible through the window. Entity extraction identifies the location (dining district), the person (close friend M.), and the activity (dinner celebration). Knowledge facts are created and linked. When the user later asks "where did I eat last week?" — ARIA knows, complete with who was there and what the occasion was.
Health data shows a declining step count over two weeks. Calendar data simultaneously shows back-to-back meetings filling every afternoon. The context accumulator detects both changes. The significance gate confirms this is a meaningful pattern, not noise. The anticipation engine generates an insight: "Your daily activity has dropped 30% while your meeting load doubled. Consider blocking 30 minutes between meetings for a walk." Delivered via push notification at an optimal time.
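The source does not describe how the significance gate decides what counts as meaningful. A minimal sketch of one plausible approach is to compare a recent window against a baseline window and pass only changes above a threshold. The function names and the 25% threshold are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_change(before: Sequence[float], after: Sequence[float]) -> float:
    """Fractional change of the recent mean relative to the baseline mean."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero; relative change is undefined")
    return (mean(after) - baseline) / baseline


def is_significant(change: float, threshold: float = 0.25) -> bool:
    """Hypothetical gate: pass only changes at least `threshold` in magnitude."""
    return abs(change) >= threshold


if __name__ == "__main__":
    # Illustrative numbers only, not real user data.
    steps = relative_change([10_000] * 7, [7_000] * 7)
    meetings = relative_change([3] * 5, [6] * 5)
    if is_significant(steps) and is_significant(meetings):
        print(f"steps {steps:+.0%}, meetings {meetings:+.0%}: worth an insight")
```

Requiring both signals to pass the gate before generating an insight mirrors the example above, where neither the step count nor the meeting load alone triggers the suggestion.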
iMessage history analysis extracts that the user discussed vacation plans with three friends over the past month. The knowledge graph links each person to the trip entity, capturing discussed dates, proposed destinations, and logistical details. When the user asks ARIA about trip planning, she already knows who is going, what dates were discussed, which destinations came up, and even which friend suggested each option — all without the user needing to re-explain anything.
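Linking people to a trip entity with typed relationships can be pictured as a small store of (subject, relation, object) edges. This is an illustrative sketch only: the `KnowledgeGraph` class, the relation names, and the placeholder entities are all hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, List, Tuple


class KnowledgeGraph:
    """Tiny typed-edge graph: subject -> list of (relation, object)."""

    def __init__(self) -> None:
        self._edges: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def link(self, subject: str, relation: str, obj: str) -> None:
        self._edges[subject].append((relation, obj))

    def related(self, subject: str, relation: str) -> List[str]:
        """All objects linked to `subject` by the given relation type."""
        return [o for r, o in self._edges.get(subject, []) if r == relation]


if __name__ == "__main__":
    kg = KnowledgeGraph()
    # Placeholder entities, not real data.
    kg.link("trip", "participant", "friend A")
    kg.link("trip", "participant", "friend B")
    kg.link("trip", "proposed_destination", "destination X")
    kg.link("destination X", "suggested_by", "friend B")
    print(kg.related("trip", "participant"))
```

Keeping the relation type on each edge is what allows a question like "who suggested this destination?" to be answered without rereading the original messages.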
Music data analysis reveals patterns: ambient and focus music during work hours, upbeat indie rock on weekends, jazz in the evenings. The music taste profile is auto-generated from 1,091 songs across 50 playlists. ARIA weaves this understanding into conversations naturally — referencing listening moods, suggesting music-related context, and understanding when the user mentions wanting something "for a chill evening" versus "something energizing."
26,836 background jobs processed by aria-tempo. The silent workhorse that runs analysis pipelines, syncs data, generates insights, and maintains the knowledge graph — all autonomously, 24/7.
All of this — 130 million tokens, 90,224 LLM calls, 26,836 background jobs, 202,751 memories, a 131,016-fact knowledge graph — costs a total of $14.81. Here's the breakdown.
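For a sense of unit cost, the headline totals can be divided out. This is a blended figure under the assumption that the full $14.81 is attributed to LLM usage; it ignores the split between free-tier and paid providers.

```python
total_cost_usd = 14.81  # from the text
llm_calls = 90_224      # from the text
tokens = 130_000_000    # from the text

cost_per_call = total_cost_usd / llm_calls
cost_per_million_tokens = total_cost_usd / (tokens / 1_000_000)

print(f"${cost_per_call:.6f} per call")
print(f"${cost_per_million_tokens:.3f} per million tokens")
```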
Every table in the system, organized by function. This is the full scope of what ARIA stores, processes, and reasons about.