Adopting big data for business intelligence

It’s hardly contested nowadays that business intelligence is beneficial to any organization, regardless of industry. Data optimization and governance have been shown to produce better decision-making in the long term.

That doesn’t mean data implementations have been perfect. Companies have been failing in their efforts to become data-driven at a much higher rate than one would expect. Others, however, are rushing ahead and have begun using external data sources en masse.

Big data has been incredibly useful for those who have successfully managed their internal sources. Strategic use of big data enables organizations to understand their customers better, create more enticing marketing campaigns, and forecast demand more accurately.

Big data and external sources

Of the determinants in the five Vs model, the two that matter most in our case are volume and velocity. Big data derived from external sources differs from data derived from internal ones because there is no practical limit to it.

Internal sources will always be limited by the size of the business. In some poetic sense, the company itself is at the mercy of its customers to get such data. If the organization is small in both operations and revenue, there won’t be much data produced. Trying to work out big insights from small datasets is often a recipe for failure.

External sources, however, are limited only by the rate at which data is produced on the internet. In practice, the velocity and volume of that data are virtually unlimited, restricted only by technical capabilities. There’s so much information being produced on a daily basis that even after all considerations and trimming of sources, there’s something to be found and analyzed.

As such, both the volume and velocity of big data, primarily from external sources, are orders of magnitude higher than internal resources would allow. Additionally, there’s an important qualitative difference in the data.

External sources provide us with data drawn from a wide variety of origins. Most of it has no direct relationship to the business that will use the information, making it far less biased than anything an internal source could produce.

In the end, a combination of both sources produces big data. External ones, however, play a much bigger role in terms of volume and velocity. It’s important to note that these two sources are complementary. While some of the insights they provide can overlap (such as customer habits), they can also offer unique signals that can help enhance overall business strategy.

Hidden BI gems in big data

External sources may not always produce unique signals that make us change strategy, but they empower our existing methods. Additionally, they may provide insights that would otherwise be unavailable.

Take the usage of CRMs, for example. Almost all digital businesses use these systems in their daily operations. Customer profiles, however, have expanded in many directions. There is now potentially useful data available on businesses and individuals scattered across the entire web.

Social media is a great example. Many companies can choose to pull publicly available data from social sources as most of their customers will have some form of presence. These enrichments would be particularly useful for those working in B2B.

A combination of internal and external sources can also create better planning and budgeting options for all businesses. External data lets organizations predict and forecast demand, while internal sources can more accurately represent the resources available to meet those needs.

This combination is especially useful for industries such as ecommerce. External data provides organizations with a better overview of the entire market, its trends, and possibilities. Businesses have successfully used various methods to collect and access vast amounts of external data.

Acquiring big data

Since most digital businesses successfully collect lots of data from internal sources, there’s often no issue with its acquisition. Its counterpart, external data, is more complicated.

It can be separated into two distinct categories: traditional and advanced. Traditional external data (e.g., government reports and statistical databases) has been primarily utilized by financial firms and large ecommerce companies. These are generally enormous datasets that provide big-picture overviews of markets and economies.

Advanced external data is somewhat of a newcomer, but it has already produced great results. Any publicly available online data, such as reviews or pricing information, can be considered advanced external data.

Big data arises when internal sources of information are combined with advanced external data. Integrating the two isn’t as difficult nowadays as it once was. Plenty of third-party web scraping solution providers and even DaaS businesses can deliver data on request.

There’s no longer a need to build scraping solutions or similar infrastructure in-house. Most of it can be outsourced at fairly reasonable prices, making data governance simpler. All it takes is a data warehouse in which to place the prepackaged information retrieved from a third party.

Analysis can be performed in two ways. The simpler approach is to treat external data as its own complete dataset and to look for insights directly from it without having it interact with internal data. Treating it as something completely independent is often easier, and there’s less room for error.
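As a rough illustration of the simpler approach, the sketch below looks at a small external dataset entirely on its own. The records, field names, and the `median_price_by_product` helper are hypothetical and made up for this example; real scraped pricing data would be far larger and messier.

```python
from collections import defaultdict
from statistics import median

# Hypothetical scraped pricing records; every value is invented for illustration.
external_prices = [
    {"product": "widget", "seller": "shop-a", "price": 19.99},
    {"product": "widget", "seller": "shop-b", "price": 21.50},
    {"product": "widget", "seller": "shop-c", "price": 18.75},
    {"product": "gadget", "seller": "shop-a", "price": 49.00},
    {"product": "gadget", "seller": "shop-b", "price": 52.00},
]


def median_price_by_product(records):
    """Group external price observations by product and return each median."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["product"]].append(record["price"])
    return {product: median(prices) for product, prices in grouped.items()}


# No internal data is involved: the insight comes from the external set alone.
market_medians = median_price_by_product(external_prices)
```

Because nothing here touches internal records, there is no matching or labeling step that could go wrong, which is the main appeal of keeping the external dataset independent.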

Sources can be combined, however, if proper labeling is undertaken and data is carefully selected. CRMs, as I’ve mentioned previously, are a great candidate for combination. Data tends to become more insightful as its datasets grow more comprehensive.
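One way such a combination might look is sketched below. It assumes CRM records and external records share a common key (here a hypothetical `domain` field); the `enrich_crm` function, the field names, and the sample data are all invented for illustration. External fields are kept under a separate label so their origin remains clear.

```python
def enrich_crm(crm_records, external_records, key="domain"):
    """Attach external fields to CRM records that share the same key.

    External fields are stored under an "external" sub-dictionary so their
    origin stays labeled and they never overwrite internal values.
    """
    external_by_key = {rec[key]: rec for rec in external_records}
    enriched = []
    for crm in crm_records:
        match = external_by_key.get(crm[key])
        combined = dict(crm)  # copy, so the internal record is left untouched
        combined["external"] = (
            {k: v for k, v in match.items() if k != key} if match else {}
        )
        enriched.append(combined)
    return enriched


# Hypothetical sample data, invented for this sketch.
crm_records = [
    {"domain": "example.com", "account": "Example Ltd", "plan": "pro"},
    {"domain": "example.org", "account": "Example Org", "plan": "basic"},
]
external_records = [
    {"domain": "example.com", "employees": 120, "source": "public company profile"},
]

enriched = enrich_crm(crm_records, external_records)
```

Records without an external match simply receive an empty `external` entry, so a missing enrichment is visible rather than silently filled in.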


Adopting big data, for most businesses, means getting involved with external sources of information. These have enormous potential hidden within them, even if we are to take them as completely independent. Combined with internal sources, however, they can greatly enhance daily decision-making and business operations.

Photo Credit: PlusONE/Shutterstock

Andrius Palionis is VP of Enterprise Sales at web scraping solution provider
