How to enrich a Business Database

How to enrich a business database and improve its quality

The web is full of data about companies: articles written on specialized websites, social networks, directories of companies, showcase websites, etc.

In France, with the recent opening of the SIRENE database by INSEE and the Infogreffe open data program, it has never been easier to identify and aggregate data on (innovative) companies.

However, extracting and processing data is a complex process that requires expertise. It is in this context that one of our clients, a French tech startup operating in the open innovation market, relies on our know-how.

Because our client’s in-house business database is low in volume and largely unqualified, the objective of the project is to harvest data on innovative companies on a recurring basis, at least once a week, and use it to enrich that database.

Method

The first step in our work is to understand the company’s value proposition as well as how it works under the hood. This makes it easier for us to identify the relevant extraction and enrichment issues.

The second step consists of taking an inventory of the existing architecture and data requirements:

  • What is the current data volume, and what is the target volume?
  • What is the existing data model, and does it require any updates?
  • What’s the minimum quality level required regarding company data?
  • How does the internal data integration process work?
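To make the questions about the data model and minimum quality level more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Company` fields, the completeness-based notion of quality, and the 0.8 threshold are hypothetical and do not describe the client's actual data model.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Company:
    # Hypothetical data model: the field names are illustrative only.
    siren: str
    name: str
    description: Optional[str] = None
    website: Optional[str] = None
    city: Optional[str] = None


def completeness(company: Company) -> float:
    """Return the share of fields that are filled in (between 0.0 and 1.0)."""
    values = [getattr(company, f.name) for f in fields(company)]
    filled = sum(1 for v in values if v not in (None, ""))
    return filled / len(values)


def meets_quality(company: Company, threshold: float = 0.8) -> bool:
    """Check a record against an assumed minimum completeness threshold."""
    return completeness(company) >= threshold
```

A record with only three of the five fields filled in would score 0.6 and be flagged as below the assumed threshold. In practice, quality usually covers more than completeness (freshness and accuracy, for example), so this is only one possible measure.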

Selecting and aggregating sources

Not all data sources (websites) are equal in terms of quality. Some provide better descriptions than others, and some information may be missing from a particular site. We therefore need to examine the candidate sources in depth before deciding how to aggregate their data.

We analyze all the sites one after the other and then issue a recommendation based on the inventory previously made. Our goal is to build and validate the entire process, from data collection to data aggregation, to ensure the highest quality of data.
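One simple way to aggregate sources of uneven quality is to rank them and, for each field, keep the first non-empty value found in ranking order. The sketch below illustrates that idea under stated assumptions: the source names, field names, and the `merge_records` helper are hypothetical, and the strategy shown is not necessarily the one used in this project.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]


def merge_records(records_by_source: Dict[str, Record], priority: List[str]) -> Record:
    """Merge records describing one company, preferring higher-priority sources.

    `priority` lists source names from most to least trusted (an assumption).
    For each field, the first non-empty value encountered wins.
    """
    merged: Record = {}
    for source in priority:
        record = records_by_source.get(source, {})
        for field, value in record.items():
            # Only fill a field that no higher-priority source has filled yet.
            if value and not merged.get(field):
                merged[field] = value
    return merged


# Hypothetical records for the same company from two directories.
sources = {
    "directory_a": {"name": "Acme", "description": None, "website": "https://acme.example"},
    "directory_b": {"name": "ACME SAS", "description": "Builds widgets", "website": None},
}
merged = merge_records(sources, priority=["directory_a", "directory_b"])
```

Here `directory_a` supplies the name and website, while the description missing from it is filled in from `directory_b`. The priority order would come from the source-by-source analysis described above.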

Data enrichment as an innovation lever

Given our client’s activity, strengthening the quality of its database is essential: it makes it possible to screen companies more effectively and with a better understanding of each one.

Indeed, open innovation (our client’s activity) consists of:

  • Monitoring innovation to enhance agility and innovation initiatives inside companies
  • Building long-lasting relationships between innovative companies and SMBs / corporates

The project results in a daily extraction of business data from a dozen directories.

Thanks to this enrichment project, our client has seen its database double in both quality and volume.
