Jelmer Rotteveel at Soar Music Group needed structured lead lists: artist contacts, curator profiles, music industry decision-makers. Eprecisio managed development of the scraping scripts that generated those lists, running them repeatedly as the business needed fresh data. Over years of running and improving those scripts, a pattern became clear: the underlying need was not a script but a platform, one that others in the music industry could use too.

That became FindSocial.

From scripts to platform: how the pilot evolved

The first phase was purely internal. Ehtisham's team built and iterated scraping scripts for Soar Music Group's outreach campaigns. Each iteration improved accuracy, coverage, and the quality of the data returned. By the time the idea of building a proper platform emerged, the scraping logic had already been refined through years of real-world use against real music industry data.

That history mattered. Most lead platforms in any vertical are built by people who understand software but not the data they are collecting. FindSocial was built by a team that had spent years understanding specifically what makes music industry lead data useful versus useless: what signals indicate an active curator versus a dormant one, what contact data is actually reachable, how to distinguish an independent artist from a major label imprint, and how the data decays over time.

The pilot-to-platform transition happened in stages:

Phase	What was built	Output
Phase 1: Internal scripts	Scraping and lead generation for Soar Music Group	Structured CSV exports for outreach campaigns
Phase 2: Data infrastructure	Persistent storage, deduplication, enrichment pipeline	Reusable, queryable lead database
Phase 3: Platform build	Next.js frontend, search, filters, export	Product usable by non-technical users
Phase 4: AI-native scraping	Autonomous agents replacing manual script runs	Continuous 24/7 data collection and refresh
Phase 5: Scale	AWS horizontal scaling, 1M+ profiles in production	Production platform with live lead generation

What we built

Agentic scraping system. The core data collection runs on autonomous agents built to operate continuously without human intervention. The agents navigate multiple data sources across the music industry, handle rate limiting and blocking gracefully, retry failed operations with backoff logic, and flag data quality issues rather than silently passing bad data downstream. Each agent is scoped to a specific data source and profile type, so failures in one do not cascade to others.
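
To make "retry failed operations with backoff logic" concrete, here is a minimal sketch of jittered exponential backoff. It is an illustration under assumptions, not FindSocial's agent code: the `SourceBlocked` exception, the `fetch_with_backoff` helper, and the default delays are all hypothetical names and values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SourceBlocked(Exception):
    """Hypothetical error an agent raises when a source throttles or blocks it."""


def backoff_delays(attempts: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Exponential delays (base, 2*base, 4*base, ...) capped at `cap` seconds."""
    return [min(cap, base * (2 ** i)) for i in range(attempts)]


def fetch_with_backoff(
    fetch: Callable[[], T],
    attempts: int = 5,
    base: float = 1.0,
    cap: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, retrying on SourceBlocked with jittered exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delays = backoff_delays(attempts - 1, base, cap)
    for attempt in range(attempts):
        try:
            return fetch()
        except SourceBlocked:
            if attempt == attempts - 1:
                raise  # out of retries: surface the failure instead of hiding it
            # Full jitter: wait a random time up to the current delay.
            sleep(random.uniform(0, delays[attempt]))
```

Re-raising on the final attempt matches the stated goal of flagging problems rather than passing them downstream silently. Passing `sleep` as a parameter keeps the helper easy to exercise without real waiting.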

Data enrichment pipeline. Raw scraped data is not useful as-is. The enrichment pipeline cross-references profiles across multiple platforms, validates contact information, resolves duplicates, scores lead quality based on engagement signals, and categorises profiles by type (artist, curator, label, agency, influencer). A profile in FindSocial is not a raw scraped record. It is a verified, enriched, scored contact.
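
The scoring step could look something like the sketch below. The fields, weights, and saturation points are assumptions chosen for illustration; the real pipeline's signals and weighting are not described here.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    name: str
    profile_type: str           # "artist", "curator", "label", "agency", "influencer"
    engagement_rate: float      # fraction of followers engaging, e.g. 0.04
    platforms: int              # platforms where the profile was cross-referenced
    has_verified_contact: bool


def lead_score(p: Profile) -> float:
    """Return a 0-100 score. All weights below are illustrative assumptions."""
    engagement = min(p.engagement_rate / 0.10, 1.0)  # saturates at 10% engagement
    presence = min(p.platforms / 3, 1.0)             # saturates at 3 platforms
    contact = 1.0 if p.has_verified_contact else 0.0
    return round(100 * (0.5 * engagement + 0.2 * presence + 0.3 * contact), 1)
```

Capping each signal before weighting keeps one outlier metric from dominating the score.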

Real-time search and filtering. The PostgreSQL-backed search layer returns results in milliseconds across 1M+ profiles. Filters cover genre, location, follower count, engagement rate, platform presence, and contact availability. A music promotion team can find exactly the curators or artists they need for a specific campaign without exporting and filtering in a spreadsheet.
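
As a sketch of how text search and filters can be combined in one parameterised PostgreSQL query: the `profiles` table, its columns, and the `search_vector` tsvector column are hypothetical, and the `%s` placeholders assume a driver such as psycopg. `plainto_tsquery`, `ts_rank`, and the `@@` match operator are standard PostgreSQL full-text features.

```python
from typing import Any, Optional


def build_search_query(
    text: str,
    genre: Optional[str] = None,
    country: Optional[str] = None,
    min_followers: Optional[int] = None,
    require_contact: bool = False,
    limit: int = 50,
) -> tuple[str, list[Any]]:
    """Build SQL plus parameters; values are never interpolated into the SQL."""
    where = ["search_vector @@ plainto_tsquery('english', %s)"]
    where_params: list[Any] = [text]
    if genre is not None:
        where.append("genre = %s")
        where_params.append(genre)
    if country is not None:
        where.append("country = %s")
        where_params.append(country)
    if min_followers is not None:
        where.append("followers >= %s")
        where_params.append(min_followers)
    if require_contact:
        where.append("contact_email IS NOT NULL")
    sql = (
        "SELECT id, name, profile_type, "
        "ts_rank(search_vector, plainto_tsquery('english', %s)) AS rank "
        "FROM profiles WHERE " + " AND ".join(where)
        + " ORDER BY rank DESC LIMIT %s"
    )
    # Order matters: the SELECT placeholder, then WHERE placeholders, then LIMIT.
    return sql, [text, *where_params, limit]
```

In a setup like this, millisecond-range responses would typically depend on a GIN index over the tsvector column, which is also an assumption about the schema.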

Production platform. The Next.js frontend is designed for non-technical users, who can browse, filter, build lists, and export. The interface reflects how music industry professionals actually think about outreach, not how a developer would structure a database query.

Component	What it does	Technology
Agentic scrapers	Continuous multi-source data collection, failure-resilient	Python, custom agent framework
Enrichment pipeline	Cross-referencing, validation, deduplication, quality scoring	Python, PostgreSQL
Lead database	1M+ profiles, structured and queryable	PostgreSQL, AWS RDS
Search layer	Real-time full-text search with multi-dimensional filters	PostgreSQL full-text, custom indexing
Platform frontend	Browse, filter, export, manage lists	Next.js, FastAPI
Infrastructure	Scales with scraping and search load	AWS, horizontal autoscaling

What made this technically hard

Data quality at scale. Getting to 1M+ profiles is straightforward if you are willing to accept bad data. Getting to 1M+ profiles that are actually useful requires continuous quality enforcement. Contact information goes stale. Artists change labels. Curators go inactive. The enrichment pipeline has to catch degradation and either re-validate or flag the profile, not silently serve outdated data to users.
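
One simple way to express "re-validate or flag" is a freshness check on timestamps. The sketch below assumes each profile stores when its contact data was last verified and when activity was last seen; the 90-day and 180-day windows and the status names are made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed windows; real thresholds would be tuned per profile type and source.
CONTACT_MAX_AGE = timedelta(days=90)
ACTIVITY_MAX_AGE = timedelta(days=180)


def freshness_status(last_verified: datetime, last_active: datetime, now: datetime) -> str:
    """Return 'fresh', 'revalidate', or 'flag_inactive' for a profile."""
    if now - last_active > ACTIVITY_MAX_AGE:
        return "flag_inactive"   # e.g. a curator who has stopped updating playlists
    if now - last_verified > CONTACT_MAX_AGE:
        return "revalidate"      # contact data is old enough to re-check
    return "fresh"


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(freshness_status(now - timedelta(days=120), now - timedelta(days=3), now))
```

Inactivity is checked first so that a dormant profile is flagged instead of being queued for a contact re-check it may not need.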

Rate limits and blocking across multiple sources. The music industry's data lives across Spotify, SoundCloud, Instagram, YouTube, music blogs, and dozens of niche platforms. Each has different rate limits, different anti-scraping approaches, and different data structures. The agent architecture was specifically designed to handle this heterogeneity: each source has its own agent with its own rate management, so one platform blocking a scraper does not stop data collection from the others.
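
Per-source rate management can be sketched with one token bucket per source, so that exhausting one source's budget has no effect on the others. The `TokenBucket` class, the source keys, and the rates below are hypothetical; they are not the platforms' real limits or FindSocial's implementation.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow up to `capacity` burst requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def try_acquire(self) -> bool:
        """Take one token if available; never blocks."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# One independent limiter per source (names and numbers are placeholders).
limiters = {
    "source_a": TokenBucket(rate=2.0, capacity=10),
    "source_b": TokenBucket(rate=0.5, capacity=3),
}
```

An agent for `source_a` would call `limiters["source_a"].try_acquire()` before each request and wait when it returns `False`; the state for `source_b` is untouched either way.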

Deduplication across sources. The same artist might appear under different names, with different spellings, on different platforms. Building a deduplication layer that correctly merges profiles without falsely collapsing distinct people into one record, and without missing real duplicates, was one of the more technically intricate parts of the data pipeline.
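
A minimal sketch of the matching idea, using only the Python standard library: normalise names, compare them with `difflib.SequenceMatcher`, and require a second agreeing signal before merging. The record fields (`name`, `email`, `country`) and the 0.9 threshold are assumptions for illustration, and a production matcher would use more signals than this.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Strip accents, case, and punctuation so spelling variants compare equal."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.casefold())
    return text.strip()


def likely_same(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Decide whether two scraped records probably describe the same profile."""
    # A shared contact address is treated as strong evidence on its own.
    email_a, email_b = a.get("email"), b.get("email")
    if email_a and email_b and email_a.lower() == email_b.lower():
        return True
    similarity = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    # Similar names alone are not enough: require a matching secondary signal
    # so distinct people who share a name are not collapsed into one record.
    same_country = a.get("country") is not None and a.get("country") == b.get("country")
    return similarity >= threshold and same_country
```

Raising the threshold reduces false merges at the cost of missing more real duplicates, which is the trade-off the paragraph above describes.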

Results

Metric	Before FindSocial	After FindSocial
Lead data availability	CSV exports from manual script runs	1M+ profiles in a live, searchable platform
Data freshness	Stale between script runs	Continuously updated by autonomous agents
Time to build a targeted list	Hours of manual export and filtering	Minutes via platform search and filters
Data quality	Variable, unchecked	Enriched, validated, quality-scored
Access	One internal team	Multi-user platform for the music industry
Coverage	Artists and curators in one vertical	Artists, curators, labels, agencies, influencers

What the platform represents

FindSocial is proof of how pilot work compounds into real products when the underlying domain knowledge is there. The scraping scripts that started as a service for one client turned into a platform because the years of running and refining them produced genuine understanding of the data.

The music industry lead problem is not unique. Almost every vertical has the same challenge: contacts are scattered across dozens of platforms, quality is inconsistent, and the people who need the data are not technical. What makes FindSocial work is the combination of AI-native data collection and the domain knowledge built through years of working in this specific space.

For how we build AI-native data platforms from pilot to production, see our Development service.

If you are building a data product or lead platform and need a team that can take it from scraping scripts to production AI architecture, book a free 30-minute call.

FindSocial: AI-Native Lead Platform for the Music Industry
