BARC Logo

Standard vs Best-of-Breed

Introduction

Choosing the right business intelligence software for an enterprise is a complex job. BI solutions range from data management to BI tools and applications with varying feature sets and user groups. Mixing and matching software packages from several vendors is referred to as the best-of-breed approach. The alternative is to choose a single strategic vendor that provides adequate coverage of all the functional areas required.

Each approach has its advantages and disadvantages. Because each functional area gets the best available product, best-of-breed solutions usually provide richer functionality. Typically, there are disparate cases to be addressed and no single solution will be able to deal with them all. Furthermore, users of existing tools often have a vested interest in keeping them, which makes the introduction of a suite of standard tools for the entire enterprise difficult. This is important because, as The BI Survey shows, satisfying users is a key to successful business intelligence projects.

However, there are several arguments in favor of buying a tool suite from a single vendor, including price, administration effort and effective data sharing between users. The promise of integrated systems is to provide multiple applications with common infrastructure elements such as user administration and data management tools, as well as a consistent user interface to reduce training costs. The risk is that some pieces of the combined offering may not be very good, which will discourage users from adopting them, increase implementation and maintenance costs, or slow down change management.

There is no practical way to make an exact cost calculation for a business intelligence project. There are simply too many variables. But when attempting to calculate the costs of software it is important to keep issues such as training and implementation costs in mind. Unfortunately, these costs are difficult to know in advance without detailed information about the project. In many cases this information is not available in the earliest phases of the selection process. It is also important to keep in mind that software maintenance can be expensive, so calculations of total cost of ownership have to include some estimation of how long the product will be used and how the number of applications and users might develop.
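Although an exact calculation is not possible, the cost drivers named above can be combined into a rough estimate. The following is a minimal sketch, not a validated costing model. The class `CostAssumptions`, the function `estimate_tco` and every figure are hypothetical placeholders, and the sketch assumes that maintenance is charged as a yearly percentage of the license value and that each new user needs a license and training once.

```python
from dataclasses import dataclass


@dataclass
class CostAssumptions:
    """Hypothetical inputs; none of these figures come from survey data."""
    license_per_user: float      # one-time license fee per user
    maintenance_rate: float      # yearly maintenance as a share of license value
    training_per_user: float     # one-time training cost per new user
    implementation: float        # one-time implementation cost
    initial_users: int           # users in the first year
    user_growth_per_year: float  # e.g. 0.1 means 10 percent more users each year


def estimate_tco(a: CostAssumptions, years: int) -> float:
    """Rough total cost of ownership over the planned period of use."""
    total = a.implementation
    users = a.initial_users
    licensed = 0
    for _ in range(years):
        # Users added this year need a license and training.
        new_users = users - licensed
        total += new_users * (a.license_per_user + a.training_per_user)
        licensed = users
        # Maintenance is paid every year on all licenses held so far.
        total += licensed * a.license_per_user * a.maintenance_rate
        # Project the user base for the following year.
        users = round(users * (1 + a.user_growth_per_year))
    return total


# Example with invented numbers: 100 users, no growth, three years of use.
example = CostAssumptions(1000.0, 0.2, 500.0, 50000.0, 100, 0.0)
print(estimate_tco(example, 3))
```

Varying the number of years and the growth rate shows how strongly the result depends on assumptions that are usually unknown early in the selection process.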

Best-of-breed systems, which lack the coverage of all the use cases but are designed to excel in one or a few application areas, can also pose challenges. These include increased training and support costs, difficulties integrating with other systems and data management issues caused by redundant data storage.

At the same time, nearly all suites are the result of larger companies acquiring several best-of-breed systems. As a result, it is often the case that customers are sold a set of incompatible products which have been relabeled to appear to be parts of the same product. In this case, the more technical arguments in favor of a suite do not apply, although organizational arguments (see below) may.

There is no simple answer to the question of whether to choose best-of-breed or an integrated system. However, there are some basic criteria to be considered and applied to individual cases.

Purchasing and Licenses

License fees and the subsequent maintenance fees are an important issue in choosing business intelligence software. The BI Survey shows that costs are a common argument against wider deployment.

Systems provided by a single vendor tend to have lower licensing costs. However, they also tend to have more complex licensing schemes, making the true costs harder to analyze. The price advantage of a suite is partly driven by the suite vendors’ common practice of throwing in extras to sweeten the deal, or bundling the product with other offerings such as transactional applications. In most cases, this option is not available to best-of-breed vendors. On the other hand, customers only looking for a subset of all the features a suite offers may find a best-of-breed solution cheaper.

Best-of-breed systems with specific feature sets and more focused product management offer the customer more. However, they tend to be more expensive. A cost justification may be needed to show that the advantages of the system compensate for higher up-front costs and the additional expense of maintaining it. This is often difficult for BI products.

Having fewer products means less effort for the purchasing department. It is easier for the purchasing department to keep track of existing contracts when there are fewer vendors. This argument is probably fairly weak in most cases because the major costs of business intelligence software are not created in the purchase cycle itself. But the purchasing department may have an unduly large influence in the selection process.

Choosing a product to replace several best-of-breed products with a single corporate standard can be problematic. It can be difficult for a company to analyze the requirements of its employees if they are satisfied with the product they are currently using. Business users tend not to have much interest in participating in software projects they do not see as bringing benefits.

Vendor Relationship

Many companies find that they are operating too many different software packages from different vendors and see the need to reduce both the number of products in use and the number of vendors they talk to. There are several advantages in having a small number of vendors.

Suite providers often enjoy a good negotiating position with their established customers. The more different products from the same vendor that are present on a single site, the more tightly the customer is locked into the system. This affects negotiations both when the customer needs new modules for an existing system and when the customer wants to replace part or all of the system with a competing tool. However, best-of-breed solutions can sometimes provide an antidote to this problem by allowing the customer to replace a complete system piecemeal.

Large customers are more important to each of the remaining vendors and have an advantage in negotiations as a result. Negotiating with BI vendors on price often leads to significant reductions from the list price, and large customers are usually in a better position to apply pressure to the sales team to cut the price.

Training

In a best-of-breed environment, IT will require additional training to administer each system. This could also mean supporting separate platforms: operating systems, databases, Web servers and programming languages. The same applies to the end-users, who may find it more convenient to have the same interface for separate functions. This offers another strong argument in favor of an integrated solution.

If there are fewer software packages in use, there is less need for training. This applies to end-users, programmers and administrators. Training costs can be very high, and a lack of proper training can also incur significant costs.

If the same product is reused across the enterprise, existing skills can be redeployed on new projects. This reduces the time required to carry out individual software projects. As The BI Survey 9 clearly shows, short project times lead to better results. Software that is more widespread in the company also reduces dependencies on the skills of individuals. In some cases, departmental business intelligence projects are carried out on a shoestring budget without temporary outside support.

Support

An environment in which multiple vendor offerings are being used side-by-side can be more difficult to maintain if the various vendors do not cooperate or point fingers at each other when problems arise. In practice, however, the potential problems caused by conflicts between vendors have little impact in the BI industry. The BI Survey 9 shows that large vendors with broad suites consistently underperform in the area of support. As a result, the argument that a single product offers better support is not strong.

Shared Data and Metadata

With a truly integrated system developed by one company, it is easier and faster to access shared data. However, it is doubtful that the BI tools themselves are the best place to focus on creating a unified data structure. As a general rule, the best way to provide shared data is at the data warehouse level, not the level of the data mart or reporting system. On the other hand, there are situations in which data warehouses are just too cumbersome to keep up with the needs of a rapidly changing company. In these situations, a semantic layer with direct access to source systems or smaller scale but more flexible departmental solutions might be feasible.

Furthermore, metadata management with an exchange of objects and information between different tools and layers of the architecture can be of benefit to developers and end-users. Since metadata standards are rarely available, or followed by the vendors, it is usually much easier (but not always possible) to exchange metadata between components of one vendor’s suite.

Functionality

The strongest case for best-of-breed solutions is richer functionality. Some areas of business intelligence are becoming commoditized, which weakens this argument, but it still holds. This is especially true of planning and other content-rich CPM applications.

Best-of-breed vendors are more likely to provide timely updates of customer-specific features. Specialized systems may be technologically advanced since it takes longer for companies to re-write a whole suite of applications. But the best Web interfaces come from larger companies, and smaller vendors have difficulty providing highly scalable solutions, either in terms of quantities of data or in terms of the number of users.

It is difficult to find a single vendor that meets all the needs of the enterprise. This reduces the number of choices a company has when it is selecting a strategic partner.

Just because a suite is good in one area, it does not mean it is good in another area. In fact it is quite common for vendors to have functional “sweet spots” with their tools while other BI application areas might be rather cumbersome. Choosing a single vendor to fulfill all company requirements runs the risk of having second rate software in certain areas.

Administration and Maintenance

Suites can reduce the complexity of maintenance, as they usually have centralized administration tools and fewer internal interfaces.

  • IT requirements are reduced when a single vendor is used because there are fewer servers to maintain. Each application has its own specific overhead.
  • Upgrading to new versions of the software is easier if there are fewer products involved. Upgrading is always an expensive and difficult process, and one that can be risky. Many software vendors recommend several upgrades per year. By keeping the number of different products low, companies can reduce the costs incurred by upgrades.
  • Collaboration between separate departments is easier if the same software is being used.
  • If all employees are using the same software, it is easier to define a single point of truth and to avoid multiple definitions and calculations of the key performance indicators.
  • Most vendors who offer wide-ranging products have gotten so big by buying other companies that their software is not very well unified.

However, it is worth noting that all of these arguments apply to situations when more than one application or more than one tool is being used. For smaller scale solutions, best-of-breed products tend to be easier to administer, and in many cases the business users can maintain their own installations. The problems arise when large numbers of such solutions are in use in the same company.

Conclusions

The following table gives an overview of the advantages and disadvantages of each approach.

Criterion | Best-of-breed | Company standard
--- | --- | ---
Purchasing | More precise feature selection | Low price, but lack of transparency. More leverage for the customer
Training | Tends to be more expensive, and support a smaller community | Knowledge tends to be more widespread, and can be reused in other projects
Administration & maintenance | Simpler, and often more accessible to end-users | Better suited for enterprise-wide applications
Functionality | The best available, but usually missing some application types | Wide-ranging, but often missing key features in particular areas

Figure 1: Comparing standard software with best-of-breed

There is no general recommendation as to whether to use a best-of-breed solution. The final decision will depend on the customer’s size, culture and management style. Decentralized organizations or organizations that often change their structures will find it difficult to maintain a single vendor over a longer period. In stable, highly centralized organizations, suites are easier to maintain.

The current market situation for larger customers is that many are beginning to adopt an integrated systems strategy and are filling in the gaps with best-of-breed systems. But the systems they are buying are often the cobbled-together suites that were created in the wave of takeovers that peaked around 2007 and has continued to affect the market. In fact, there are very few options on the market today for truly unified suites so, at least to some extent, best-of-breed specialists are to be found in nearly every company.

 


BIaaS – Business Intelligence as a service

Background information on SaaS

The idea of hosting software off-premises is not new. In fact, software as a service (SaaS) is just a rewording of an older term for the companies providing such services: application service provider (ASP). ASPs were a very popular investment target during the dotcom era, but not much ever came of the business model. Like so many Internet-based startup ideas in this era, most of these companies disappeared. In fact, the only major exception to this rule was Salesforce.com, which used to refer to itself as an ASP before the SaaS term became popular. And indeed Salesforce.com is often taken as an example in discussions of the future of Business Intelligence as a service.

But why was the ASP wave a failure and why should SaaS be different? Even the often quoted success of Salesforce.com is far removed from the company’s once loudly trumpeted goal of replacing all the ERP features of SAP and PeopleSoft. Furthermore, do the forces that could make SaaS more successful than ASP also apply to the BI market? After all, Salesforce.com is arguably more valuable to users as a means of automating sales processes than as a means of providing timely analysis for strategic or even tactical decision making. It can be viewed as an ERP company, and the history of ERP companies attempting to conquer the BI market is not particularly encouraging. So even if SaaS spawns more companies like Salesforce.com than ASP did, it does not necessarily mean the SaaS companies will be BI companies.

SaaS is not about the software that is delivered, but how it is delivered. The resources of a cloud environment such as storage, processing, memory, and network bandwidth are generally pooled and delivered in a multi-tenant model. Thus, the discussion must focus on the advantages of the delivery model itself.

One key to understanding SaaS is that there are two different ideas involved that are often mixed freely: how the software is hosted and how usage is billed. The internet allows companies to deliver software off-site, and this is the essence of cloud computing. Cloud computing environments require broadband networks to provide acceptable response times, especially for users of thin clients. But an SaaS application is provided on demand, meaning that the users only pay for what they use, not a flat fee. This is an additional feature that is not necessarily a given in a cloud environment.

Self Service

Self service is a popular idea in the BI space. In fact one of the key ways of speeding up BI projects and providing more flexibility is removing IT from the loop, thus empowering users to carry out data analysis on their own without being tied to slow bureaucratic IT processes designed to maintain the integrity of enterprise transactional systems.

And there is no question that SaaS provides a kind of flexibility that normal BI tools cannot. In fact one of the key features in cloud computing is that systems can scale up and down seamlessly, and this can be done in the simplest cases without involving complex processes on the customer side. Obviously, more complex processes such as user management are not simplified by changing the platform. This includes role management and security management.

This latter point leads to a key question: does the added flexibility provided by BIaaS actually address the key flexibility issues that BI users face? And the answer is probably no. For it is true that BI users commonly have difficulties with enterprise driven BI initiatives because of the lack of flexibility — but the lack of flexibility relates to data access, data modeling and other site specific issues that BIaaS does not improve at all and in some ways may even make more difficult.

As a result, the issue of BIaaS versus on-site hosting is probably not a key factor in choosing a product. Instead, prospective buyers should weigh issues such as costs, performance and features when choosing a BI tool. A possible exception involves data acquisition for BI tools. It could be useful to combine a BIaaS solution with a data as a service solution. Hosted data management could be interesting, if the data originates from other SaaS solutions.

Project types

CRM, which is often seen as closely related to BI, has seen a good deal of success in the SaaS sector. The most commonly cited example is Salesforce.com, but it has a growing field of competitors. But is the Salesforce model applicable to BI products?

Larger companies have been slow to adopt BIaaS. The exceptions tend to be vendors such as Nielsen and some logistics companies, but these companies supply data as well as a service. One commonly cited reason is the lack of features in BIaaS, but this is probably not an issue with the delivery model itself. As new vendors enter the market, the range of sophisticated feature sets has improved, and BIaaS is now available for corporate performance management suites and on-demand predictive analytics.

Another major issue is data integration. It is not at all clear how an application that resides in the cloud can efficiently process on-site data. This is particularly problematic for real-time applications, which would need to support a constant stream of data being moved off-site and back on. For the time being it is not unreasonable to say that BIaaS is not suitable for real-time BI.

There are also very few planning tools that use BIaaS. In fact, dashboards, reporting and simple analysis are by far the most common applications.

BIaaS is often combined with other, usually transactional, software as a service solutions. The hosted BI solution reports on or analyzes data that already resides in the cloud, so neither software nor data needs to be transferred on-premises.

Costs

The main argument offered by vendors in favor of SaaS BI is that it is less expensive than the more typical on-premises software. Avoiding the upfront capital costs of hardware contributes to these savings. The other major factor is that the company is not required to operate separate servers for the BI application.

However, both of these savings result from the investment required to own and operate the hardware for BI software. Similar savings can also be realized with strictly in-house approaches such as server virtualization. Furthermore, the use of Web technology has already eliminated one of the key costs of BI initiatives, which is the cost of rolling out the front-ends to large numbers of users.

Another key cost advantage that BI SaaS brings is the ability to switch off the service when the licenses are not required. This is a major issue because it is very hard to anticipate BI tool usage in larger organizations, making BI shelfware a widespread problem. But it is also an interesting subject for applications that are only used intermittently, or seasonally. Planning tools are a good example of this kind of software.
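The effect of switching off unused licenses can be illustrated with a small comparison. This is a minimal sketch under stated assumptions: the functions `flat_license_cost` and `on_demand_cost`, the fees and the usage pattern are all invented for illustration and do not reflect any vendor's actual pricing.

```python
from typing import Sequence


def flat_license_cost(users: int, fee_per_user_year: float) -> float:
    """Yearly cost when every user is licensed for the whole year."""
    return users * fee_per_user_year


def on_demand_cost(active_users_by_month: Sequence[int],
                   fee_per_user_month: float) -> float:
    """Yearly cost when only the users active in each month are billed."""
    return sum(active_users_by_month) * fee_per_user_month


# Hypothetical planning tool: 200 planners during a three-month budget
# season, 20 occasional users in the remaining nine months.
usage = [20] * 9 + [200] * 3

flat = flat_license_cost(200, 600.0)    # all 200 users licensed all year
on_demand = on_demand_cost(usage, 80.0)  # higher monthly rate, billed by use

print(f"flat: {flat:.0f}, on demand: {on_demand:.0f}")
```

In this invented scenario the on-demand model is cheaper even though its monthly rate is higher than one twelfth of the flat yearly fee. With steady usage throughout the year, the comparison would tip the other way.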

Agility

Many organizations, especially larger ones, are burdened by the time and effort needed to keep their software up to date. It can take several months before installed updates and upgrades can be used in a production system. BIaaS solutions often offer frequent updates and new features. They relieve IT organizations of cumbersome test cycles that create additional maintenance work, and give users the opportunity to benefit from new features and functions faster.

Another important issue that sometimes confronts large projects is the sudden or temporary need to significantly increase the data storage for the application. Because cloud solutions use shared server farms with very large capacity, this is much easier to deal with than on dedicated servers.

Security

Security is one of the key reasons why potential customers have shied away from BIaaS. Companies tend to be vigilant about securing their data, and the strategic data often found in BI applications is particularly important. Vendors have recognized this challenge and have addressed it with third-party auditing and certification. The two best known certifications are SAS-70 and SysTrust. Both certifications provide information to potential customers on how the vendor maintains client data. They also address issues arising from governance, risk and compliance (GRC).

At one level, data stored with outside providers may well be even more secure than data stored by companies themselves. This is particularly true of smaller organizations that need to work on a shoestring budget. The large server farms run by cloud operators tend to offer much more security than most companies can.

Nevertheless, there is still a good deal of nervousness in the market about using SaaS to store key data. In particular, companies seem to worry that data stored in the cloud will leak to other users. Perhaps one reason for this is the fact that SaaS providers are perceived as having their origins in the consumer business rather than the secure networking industry. Many potential users may associate BIaaS with products like Facebook, which is not known for its privacy features. Many users have also noticed that Google reads their mail to provide advertising in its email client, which is hardly reassuring. And the recent decision of Amazon, which is certified as a trusted provider, to bow to political pressure and simply cut off Wikileaks.org without notice also sent a shiver through the market. Privacy concerns are further spurred by some legislative demands, such as laws requiring that crucial personal data not be stored outside the country. This obviously complicates the use of service providers with a general purpose, globally active infrastructure.

Another issue that creates risk is the lack of control over data. This can only be addressed on a vendor-by-vendor basis by careful analysis of the license agreements made in advance, but it undermines the argument that BIaaS companies are more flexible. Exporting data stored off-site may bring unexpected costs. Furthermore, it may be that providers limit the usage of the data they store, and additional licensing for uses not originally planned may be expensive.

BIaaS-specific selection criteria

Companies looking into any BI solution have to analyze a wide variety of issues. These include the performance, features and usability of the product. For more information on general best practices for purchasing software, see our guide to purchasing BI software.

However, there are certain aspects of SaaS offerings that merit special attention. These include the following:

  • Security: Security is a key consideration because BIaaS involves moving strategic data off-site and storing it at a third party location. Outsourcing IT is nothing new, but in this case the issues of security seem more acute than in most, because the data used in BI applications is specifically modeled to be easy to analyze and reuse.
  • Performance: Performance may well be a major issue when using off-site tools. In particular, this method of product delivery introduces additional performance risks into this sensitive area, and the risks may be more difficult for the customer to control than if the problems were on-site. On the other hand, a good service level agreement, if the provider agrees to one, might hand the task of tuning the systems to meet performance demands over to the service provider.
  • Training: A BIaaS provider may lack resources in the customer's locale. Part of the price advantage of BIaaS is that the expense of creating a geographically widespread organization is reduced. However, this also means that customers have less access to experienced consultants.
  • Service levels: Support is one of the key long-term costs of BI software. Since a BIaaS provider may not have an organization in your country, this may be a problem.
  • Customization: BIaaS products are often distributed as finished applications with little flexibility. Creating an on-line development system with the power of locally hosted applications represents a significant technical challenge.

Business Intelligence and Web Front ends

Introduction

The complexity of Web technology has increased greatly since the first simple browsers became popular in the nineties. The original idea was very simple: Web servers provided access to simple formatted texts which could contain hyperlinks to other texts, creating a worldwide web of interlinked documents.

Since then the technology has flourished in many ways, but it is possible to cut through the almost impenetrable thicket of technologies by classifying the developments. Once you do this, things look much simpler.

  • The original hypertext markup language HTML has gotten much more complex, allowing much better formatting and interaction. Most importantly, HTML pages can embed various non-text objects, a capability that was originally intended for displaying images.
  • Originally, HTML files were stored on the Web server and served up on request. This has changed a great deal, and now most Web pages are provided by content management systems which may serve static HTML files or assemble the HTML files based on the requests that come from the browser using data held in various databases. Thus the business intelligence software that creates Web content often comes with a simple BI portal.
  • The HTML pages now embed scripts that run in the browser and allow the documents to be dynamic – that is, to modify themselves on-the-fly while the user interacts with them. The language of these scripts is JavaScript, which was originally invented by Netscape. Netscape created a lot of confusion when it chose this name, because JavaScript has nothing whatsoever to do with Sun’s Java language, except for some superficial similarities in the syntax.
  • The objects embedded in the HTML pages have gotten more and more complex. The original images have now been replaced by various interactive objects. The Java programming language actually embeds a complete virtual machine – a sort of toy computer – into the document. This approach has been copied by several vendors, including Microsoft with its .NET technology, which is nearly a clone of Java, and increasingly by Adobe’s Flash technology, which was originally intended for cute little scripted animations but now has many more capabilities.

The main point of Web technology, from the point of view of BI deployments, is to reduce the effort involved in rolling out an installation of the software to every user. Web front ends also make it possible to use the software outside the company in the extranet. Installation is usually automated in larger organizations, but even then additional applications bring additional expense and risk for the IT department, and these costs are often passed on to the other parts of the organization. So-called “zero footprint” solutions do not require any installation at all, reducing the total cost of ownership of the tool, assuming all other factors are equal.

That said, there are other costs in addition to the rollout costs. For example, zero footprint deployments require processing on servers that would not otherwise be needed, and send more data across the network than a good client/server application, which has local caching, local chart generation, a local calculation engine and so on. This increases hardware and network costs, while reducing the quality of the user experience.

But on balance, Web-based tools are most popular in enterprise environments, where IT issues dominate the discussion. Note that this discussion of deployment costs contains an implicit criticism of the personal computer with desktop software installations as an enterprise tool, which is why the Web is widely seen as an alternative or even a threat to Windows. By the same token IT departments which are skeptical of Windows technologies tend to strongly favor Web-based applications.

At the same time, IT departments tend to view programmed components embedded in Web pages with a good deal of suspicion because of the potential security issues involved. Every time users load a Web page, they are loading code from a server (potentially one outside the company) into the company’s network – a nightmare for busy administrators. Microsoft’s ActiveX in particular is a target of IT ire. The problem is not that there is anything inherently wrong with ActiveX – in fact it is heavily used in Windows. Furthermore, in an intranet that is the typical environment for business intelligence solutions, all the code being loaded is already on the company’s network. The only difference in an intranet environment is that some of this software runs in the browser, and some on the Web or application server. The problem is that if you allow one ActiveX object to run on a browser there is no satisfactory way to prevent other potentially malign ActiveX objects from entering from outside as well.

For the end user, Web-based BI front ends offer little in the way of advantages over Windows. Because Web applications tend to be less interactive than Windows clients, and tend to have restricted access to local data sources for security reasons, there is a strong tendency for business users to prefer Windows client tools. Here’s a brief list of the main issues to be kept in mind when selecting a Web-based business intelligence tool:

  • Web applications allow the user to access the application from any networked PC, not just his own PC which happens to have the software installed.
  • BI applications running in browsers lose usable screen space to the browser, which displays toolbars, tabs, and an additional menu bar which the application does not use. This loss of screen space at the top of the screen is exacerbated by modern wide-screen laptops which have shallow screens that were originally designed for the larger television market but have spilled over into the computer business because they are already mass-produced.
  • Unlike Windows applications, Web applications often fail to take advantage of wide or tall screens, so they end up wasting even more screen space. All this forces users to scroll sideways or up and down much more than in a decent Windows application, thus reducing the usability.
  • The menu of the browser creates more problems than just taking up valuable space on the screen. Users often confuse the browser’s back button with the application’s undo button. If they accidentally hit the back button, they may be taken right out of the BI application, instead of just undoing their most recent action.
  • The same confusion between browser functionality and application functionality makes it difficult for users to control font sizes and other aspects of the user interface.
  • Printing may be an issue and involve server-based PDF generation. In fact, creating a PDF file on the server and presenting it in a new window for the end user is about the only really viable way to print a document presented in a Web interface. A few applications attempt to print in the browser but this is unlikely to work well since it cannot take advantage of locally installed fonts or wide page capabilities. Again, all this can be confusing to end users, to whom it will seem logical to use the print function built into the browser.
  • Performance will usually be much worse than with a good client/server or local Windows application. This is a serious issue for BI deployments, as The BI Survey regularly finds, and performance issues are a major cause of user dissatisfaction.
  • Web interfaces often lack strongly interactive GUI features, like double click or context menus when the user right-clicks an object. Drag and drop is another form of interactivity that is often missing or weakly implemented in Web interfaces. The lack of context menus is most noticeable in supposedly highly interactive Flash interfaces. It is only one example of a common problem: it is often unclear whether user actions invoke browser functions or application functions.
  • It is hard to drag content from the Web applications to local PC applications, such as Excel. This may require special features in the application to support such links, which would be natively available in Windows applications.

As Web usability continues to improve these GUI arguments are getting weaker, but they still apply to the self-service BI market. For security reasons, Web-based tools usually do not have access to local files, and in self-service BI scenarios, end users often provide their own data management and prefer to remain independent of IT. Furthermore, as convenient as Web user interfaces are to IT departments, they are more complicated to administer in a departmental environment, because they require a Web server.

Common Web technologies

Business users facing a decision on which software to choose are often confronted with a slew of formats to choose from, and little guidance on their differences. The following provides information on the most common Web formats. Note how Microsoft has brought out its own version of each technological idea. This phenomenon is related to the perceived threat of the Web to Windows, mentioned above.

ActiveX

ActiveX elements are reusable native Windows software components based on Microsoft’s (pre .NET) COM technology. Internet Explorer allows them to be embedded in HTML pages.

AJAX

AJAX is a special type of advanced DHTML that allows Web pages to get data from the server incrementally – without refreshing the entire page. AJAX is a major improvement in interactive Web pages, because refreshing the entire page is slow and causes the browser’s screen to go briefly blank, interrupting the user’s interactive experience.

DHTML

DHTML is short for Dynamic HTML, which is HTML enhanced with JavaScript to make it interactive. The scripts actually modify the content of the document while it is being displayed and can react to user actions to provide interactivity. DHTML is the format most commonly referred to as “zero footprint”. The scripts that make these interfaces dynamic and interactive are actually executed in the browser, not in any add-on object. This is where the term zero footprint comes from – as long as the user has a browser that complies with Web standards they won’t need any additional software to make the application work. DHTML interfaces are somewhat clumsy to program, however, especially because the JavaScript language was never really intended for creating large applications. They also tend to be plagued by minor bugs caused by slightly incompatible browsers. However, in recent years this technology has gotten surprisingly good. The most spectacular implementations include a small amount of data stored in the HTML file itself, which allows this technology to be used to provide off-line capabilities.

Flash

Originally a format for vector-based animations, Adobe Flash is now a general platform for creating small, highly interactive Web applications and games. It requires an add-in for the browser to execute, but most IT departments do not object to the Flash add-in the way they would object to other add-ins, especially ActiveX add-ins.

Some of the most spectacular new BI interfaces are created using this technology. One feature that is typical of these interfaces is the glazed jellybean look that the objects sometimes have. Flash interfaces also tend to be highly interactive, allowing drag-and-drop and lassoes for grabbing multiple objects. But they do not have the Windows standard context menus available by right clicking the mouse. They also feature animations so changing the selections results in the charts smoothly morphing from one state to the next instead of just switching from view to view.

These objects can also store quite a bit of data. Several BI companies have made use of this ability to create a limited off-line functionality based on Flash.

Flex

Flex is a software development kit for Flash. The term is often used interchangeably with Flash.

HTML

A simple syntax for describing page layouts featuring links to other documents. Browsers convert HTML into formatted text.

Java

Java is Sun Microsystems’ programming language based on a virtual machine. The virtual machine has its own internal language which mimics the machine language of a real computer and provides most of the services a computer with an operating system would provide – it is a closed world. The Java virtual machine supports any programming language that compiles to its internal syntax, but the Java language is almost always used. Sun has now been bought by Oracle, so this key technology now belongs to them. Because the Java program is completely executed inside its virtual machine, Java programs run the same way on any computer that has Java virtual machine software installed. Java applets are complete virtual machines embedded in Web pages, but Java programs also exist as stand-alone tools outside the browser. Java programs also run on mobile devices.

JavaScript

JavaScript is a scripting language that runs in the browser itself. JavaScript is very different from Java although it does have a similar syntax. The name was a marketing ploy by Netscape back in the days when Netscape was battling Microsoft in the browser wars. Microsoft also created a version of JavaScript. The Microsoft version is called JScript.

.NET

.NET is Microsoft’s Windows-only near clone of the Java virtual machine. .NET comes with compilers for multiple languages including C++, Visual Basic .NET and the Java-like C#, but all these compile to the same internal format. Microsoft created .NET after losing a lawsuit on the future of Java to Sun. .NET built on Microsoft’s excellent development tools and has been quite successful. It is now used for building many Windows applications, but not for building Office or Windows itself. It has not caught on as a Web front end.

Silverlight

Silverlight is a Microsoft platform based on .NET that closely imitates Adobe Flash. Silverlight uses Microsoft’s newest .NET visualization, and can also be viewed as a new viable Web front end for .NET.

XHTML

XHTML is a version of HTML that is cleaned up a bit to conform to the more general XML format. XHTML is not really of any interest from the application user’s point of view.

XML

XML is a markup language for describing objects. Even in the hype-rich Web environment the amount of hype surrounding XML is remarkable. XML is not really a standard, as is often claimed, but simply a way to make non-standard content a little easier to deal with.

Relational Databases

Relational databases have been used by businesses for decades and they have steadily developed the features required for large-scale implementation, including scalability (in terms of user count and database size). The idea of relational databases was first suggested by Edgar Codd in 1969, and by the nineties, they had largely replaced hierarchical and network databases for business purposes. But the original definition is excruciatingly abstract, and no Codd-standard “true” relational database has ever been commercially released, although there are academic implementations. Textbook discussions of relational databases descend into pure math almost immediately. For practical purposes, relational databases are databases that can be queried using SQL, but purists point out that the current ISO standard definition of SQL does not mention the relational model and ISO SQL deviates from the relational model in several ways.

Roughly speaking, a relational database is a set of data connected by relations. To take a simple example, an invoicing system could have one table with a list of invoices and another with a list of customers. Each table has one record (referred to as a tuple) per line, and defines a relation – for example, the invoice number is related to the date the invoice was issued. In fact, relational tables are often referred to as “base relations”.

But there are other relations in the database. The data in the two tables mentioned above are also related to each other by customer numbers, which appear on each line of both tables. To see how much a given customer bought in sum, filter the invoice table by the customer number and sum the total column of the result. To get the address of the customer for a given invoice, filter the customer table by the customer number in the invoice and look at the address column. These are derived relations, more commonly known as views or queries.
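
The two derived relations described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names and sample rows are assumptions made for the example, not part of any particular invoicing product.

```python
import sqlite3

# In-memory database; the schema and the sample data are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_no INTEGER PRIMARY KEY, name TEXT, address TEXT);
    CREATE TABLE invoice  (invoice_no INTEGER PRIMARY KEY, customer_no INTEGER,
                           issued TEXT, total REAL);
    INSERT INTO customer VALUES (1, 'Acme', '1 Main St'), (2, 'Globex', '9 High St');
    INSERT INTO invoice  VALUES (100, 1, '2010-01-05', 250.0),
                                (101, 2, '2010-01-07', 80.0),
                                (102, 1, '2010-02-01', 150.0);
""")

# Derived relation 1: filter the invoice table by customer number and sum the totals.
customer_total = conn.execute(
    "SELECT SUM(total) FROM invoice WHERE customer_no = ?", (1,)
).fetchone()[0]

# Derived relation 2: look up the customer address that belongs to a given invoice.
invoice_address = conn.execute(
    """SELECT c.address
       FROM invoice AS i
       JOIN customer AS c ON c.customer_no = i.customer_no
       WHERE i.invoice_no = ?""",
    (101,),
).fetchone()[0]

print(customer_total, invoice_address)
```

Both queries read the base tables without changing them, which is what makes them views of the data rather than new data.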

Relational databases are now so widespread that it is hard to imagine a modern company without one. But it is not just the details of the relational model which give them such a firm position. The ubiquity of technical skills required to administer relational databases and a huge number of applications that use them guarantees that relational databases will be used for many years in the future. Another strength of relational databases is the widespread use of SQL, which makes it easy for an application to work with multiple relational databases. This was not true of hierarchical databases. Establishing SQL as a standard probably led to the development of the whole ERP industry and the rest of the OLTP applications market.

One of the great advantages of relational databases is their ability to support transactions. The physical storage of relational data on the disk is usually by row. Row-based databases are better for short rows and situations where all columns are needed in most queries. This applies to transactions, which often write all the columns of a new row at the same time. However, it is worth noting that key parts of transactions are actually add-ons to the basic relational model and are achieved in the following steps:

  • Lock affected areas of the database until the transaction is complete
  • Log all changes to the database in a separate log file
  • Write back the result immediately
  • Roll back if the transaction fails by undoing the changes recorded in the log.
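
The visible effect of these steps can be sketched with Python's sqlite3 module. The locking and logging are carried out internally by the database engine, so the sketch only shows that a failed transaction leaves no trace. The account table, its CHECK constraint and the transfer function are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100.0), (2, 50.0)")
conn.commit()


def transfer(conn, source, target, amount):
    """Move an amount between accounts; either both updates apply or neither."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed; the credit made above has been undone.
        return False


ok = transfer(conn, 1, 2, 30.0)       # both updates are committed
failed = transfer(conn, 1, 2, 500.0)  # would overdraw account 1, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the first update has already been applied when the second one violates the constraint, so the rollback is what restores the credited account.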

Relational databases used for large scale business environments have highly refined versions of these extra features.

A variety of other features have clustered round the relation concept and allowed the databases to be used in many different contexts and to scale almost unimaginable heights. Strictly speaking, these additions have nothing to do with the relational model and, indeed, non-relational databases also have similar capabilities. So the relational model owes much of its success to the development of external concepts.

  • Highly developed indexing methods: Indexing sounds boring, but it has been a rich field for technical innovation over the past few decades. Most modern relational databases use multiple indexes simultaneously and indexes often consume more storage space than the raw data in the database. Part of the database’s internal query optimization is an automatic on-the-fly selection of which index to use. Individual index types tend to work best in specific use cases. For example, bitmap indexes are most commonly used for analysis on tables with many columns of yes/no data. They are not much use in other applications.
  • Redundancy: Another feature of modern relational databases is the ability to spread data storage across multiple servers, and provide redundancy for safety.
  • Multiple interfaces: Relational databases come with their own proprietary interfaces, but there are also standard interfaces to allow third-party products to access them generically. Providing multiple technical interfaces, all of which support SQL, is another way of opening up the database to as many third-party products as possible. The most common interfaces are the Windows-based ODBC and the Java-based JDBC. As a rule, the native interfaces provide the best performance.
  • Stored procedures: Stored procedures were originally introduced by individual vendors and are now part of the SQL standard as optional features. They are stored in the database and are fragments of procedural code that can be called from the client and executed on the server. They allow highly complex applications to run on the database server itself. Unlike SQL itself, which is reasonably well standardized, the syntax for procedures to manipulate data is completely different in every database.
  • Large-scale databases also take good advantage of multi-threaded servers, so processing loads can be spread across multiple CPUs.
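
The automatic index selection mentioned in the list above can be observed in a small way with SQLite, which reports its chosen access path through EXPLAIN QUERY PLAN. The sketch below assumes a hypothetical invoice table and index name. The exact wording of the plan text differs between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice (invoice_no INTEGER PRIMARY KEY, "
    "customer_no INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO invoice (customer_no, total) VALUES (?, ?)",
    [(n % 50, float(n)) for n in range(1000)],
)

query = "SELECT total FROM invoice WHERE customer_no = 7"


def plan(conn, sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql)
    return " ".join(row[-1] for row in rows)


plan_before = plan(conn, query)  # no suitable index exists yet
conn.execute("CREATE INDEX idx_invoice_customer ON invoice (customer_no)")
plan_after = plan(conn, query)   # the optimizer can now pick the new index

print(plan_before)
print(plan_after)
```

The query text is identical in both cases; only the plan chosen by the database changes once the index is available.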

Other types of database also have these features, and relational databases are not the only type around. IBM IMS, a hierarchical database, is probably still the leading OLTP database, and is reputed to be IBM’s biggest billing software product. IDMS is also still around, as well as Adabas, Model 204, and other non-relational databases. Of course, some of these now support SQL queries, even though the underlying storage model is not relational.

One of the weaknesses of relational databases is the way that related data tends to be spread around the database. Even the simplest invoicing system would require four tables to be useful: a customer table, an invoice header table, a product table and an invoice line table. The latter is needed because each invoice can have multiple lines. Printing the invoice means filtering all four tables and combining the results into a single report. And real-world applications are much more complicated than that.
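
A minimal sketch of this four-table layout, again using sqlite3, shows how much joining is needed just to print one invoice. All table names, column names and sample values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer       (customer_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product        (product_no INTEGER PRIMARY KEY, description TEXT,
                                 price REAL);
    CREATE TABLE invoice_header (invoice_no INTEGER PRIMARY KEY, customer_no INTEGER,
                                 issued TEXT);
    CREATE TABLE invoice_line   (invoice_no INTEGER, line_no INTEGER,
                                 product_no INTEGER, quantity INTEGER);
    INSERT INTO customer VALUES (1, 'Acme');
    INSERT INTO product VALUES (10, 'Widget', 2.5), (11, 'Gadget', 10.0);
    INSERT INTO invoice_header VALUES (100, 1, '2010-01-05');
    INSERT INTO invoice_line VALUES (100, 1, 10, 4), (100, 2, 11, 1);
""")

# Printing one invoice touches all four tables.
invoice_rows = conn.execute(
    """SELECT c.name, h.issued, l.line_no, p.description,
              l.quantity * p.price AS amount
       FROM invoice_header AS h
       JOIN customer     AS c ON c.customer_no = h.customer_no
       JOIN invoice_line AS l ON l.invoice_no  = h.invoice_no
       JOIN product      AS p ON p.product_no  = l.product_no
       WHERE h.invoice_no = ?
       ORDER BY l.line_no""",
    (100,),
).fetchall()

for row in invoice_rows:
    print(row)
```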

Normalization, which results in splitting the data up into small tables, is the usual mode for storing relational data. Normalized databases are optimized to avoid inconsistencies in transactional systems. But data that is spread around this way is not at all easy to analyze by hand. Most applications based on relational databases provide the user with highly abstract views of the data, which make sense from a business perspective but are very difficult to see in the underlying data. The queries needed to analyze transactional data tend to be extremely complicated because related data is distributed around the database as a result of the normalization. The queries are typically difficult to define, requiring multiple joins and a complete understanding of the complexity of the schema. They are also quite slow, because the database has to join the data from the various tables back together to answer the query.

Another problem with analyzing transactional data directly is that the data model is not designed to return aggregated views such as the sum of all the transactions in a given time period. But this kind of aggregation is typical for analysis. Attempting to carry out analysis directly on data in this format results in poor performance. It should never be attempted on anything but the smallest transactional systems.

The data structures in applications built on relational databases are sometimes very complex. Large ERP systems can contain tens of thousands of tables. Vendors of business intelligence software commonly say they support various relational databases, but they rarely mention the fact that without semantic support the ability to connect to a complex relational database is almost useless. In other words, if the software that is extracting data does not know where to find the data in the right form, then the user will have to know where the data is, and build queries by hand. This issue can lead to significant cost increases in business intelligence projects.

To make analysis of transactional data feasible, many companies create data warehouses. This involves extracting data from its original source, transforming and cleansing it and finally loading it into a new database in a form that is suitable for analysis. Data warehouses are usually stored in relational databases and are often the first place in the enterprise where basic analytic concepts such as a time axis appear. The data management processes involved can be quite complex and often involve exporting the data to large text files and re-importing it into another database. The issue of data quality is particularly vexing. In many cases the data in the warehouse comes from multiple, otherwise unconnected, sources.

However, data warehouses can be very large and unwieldy. To make analysis practical, small subsets of the data are often extracted into so-called data marts for analysis purposes. Data marts are sometimes stored in relational databases, but in many cases other forms of database are used, particularly multidimensional databases. The idea here is to trade off details and complexity for speed and simplicity. Attempting to retain the complexity of the underlying ERP system in the data mart is trying to have your cake and eat it too, and is probably doomed from the start. Well-designed data marts always lack most of the complexity of the transactional world they are based on. They are optimized for accessibility, and usually hold aggregated data only. Because they feature reduced size and complexity, data marts can often deliver query results very quickly, making them popular with business users.
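
The trade-off of detail for simplicity can be sketched as a single aggregation step. The example below assumes a hypothetical detail table called sales_detail and builds a much smaller table, mart_sales, that keeps only one row per month and region. A real data mart would of course be fed by a proper load process rather than one statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_detail (sold_on TEXT, region TEXT, product TEXT, amount REAL);
    INSERT INTO sales_detail VALUES
        ('2010-01-05', 'North', 'Widget', 100.0),
        ('2010-01-19', 'North', 'Gadget',  40.0),
        ('2010-01-20', 'South', 'Widget',  60.0),
        ('2010-02-02', 'North', 'Widget',  25.0);

    -- The mart drops the product and day detail and keeps aggregated values only.
    CREATE TABLE mart_sales AS
        SELECT substr(sold_on, 1, 7) AS month, region, SUM(amount) AS amount
        FROM sales_detail
        GROUP BY substr(sold_on, 1, 7), region;
""")

mart_rows = conn.execute(
    "SELECT month, region, amount FROM mart_sales ORDER BY month, region"
).fetchall()

for row in mart_rows:
    print(row)
```

Questions about individual products can no longer be answered from the mart, which is exactly the detail that was given up in exchange for a smaller and simpler table.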

General-purpose relational databases designed for enterprise transactional systems are not always the best choice for storing data warehouses. The major vendors of relational databases have tried to optimize their products for this market, but face stiff competition from a relatively new technology in this area – data warehouse appliances, which offer a combination of hardware and software for high speed performance on very large quantities of data, sometimes even in the petabyte range. Data warehouse appliances use parallel architecture with optimized custom hardware and operating systems to deliver performance. Some of the systems also come with chips built into the disk controller that carry out simple filters on the data as it comes from the disk. In these systems only a fraction of the data needs to be loaded and processed by the server itself, and appliances typically make much less use of complex indices, which means they also store data much more compactly. These innovations allow appliances to process huge quantities of data on relatively cheap custom hardware.

The physical data storage model also varies from database to database. Again, this is an implementation detail and, strictly speaking, it has nothing to do with the relational model itself. The standard way to store data in transactional databases is to store by row – all the data in the first row of the table, followed by all the data in the second row, and so on.

In columnar databases, the data is stored by column, not by row, and this has an effect on how the database is used. Storing by column makes typical transactional operations such as writing or reading every column of a single row from a large database slow, so columnar databases are not well adapted to typical transactional systems. To see why, consider a command to add a new invoice to a system. It adds a new row to the invoice table, so it would require accessing and changing all the columns of the entire table, because the table is stored by column. So typical transactional operations are fairly clumsy on columnar databases. But reading one or two columns from every row in a table is fast using a columnar database. Reading a few columns and ignoring the rest is typical for OLAP operations. Because columnar databases allow you to ignore columns they are better for databases with large numbers of columns and allow separate indexes per column for optimal access speeds. They are more efficient for adding new values of a column for all rows because only one column is touched.
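
The difference can be illustrated with plain Python data structures. This is a toy model under obvious simplifying assumptions: lists stand in for disk storage, and the column names and values are invented. It only shows which parts of the data each operation has to touch.

```python
COLUMNS = ("invoice_no", "customer_no", "total")

# Row store: one tuple per invoice, stored one after the other.
row_store = [(100, 1, 250.0), (101, 2, 80.0), (102, 1, 150.0)]

# Column store: one list per column, built from the same data.
column_store = {
    name: [row[i] for row in row_store] for i, name in enumerate(COLUMNS)
}


def insert_row_store(store, row):
    """Append a whole row in one place; returns the number of writes."""
    store.append(row)
    return 1


def insert_column_store(store, row):
    """Append one value to every column; returns the number of writes."""
    for name, value in zip(COLUMNS, row):
        store[name].append(value)
    return len(COLUMNS)


def sum_total_row_store(store):
    """Every complete row is visited although only one field is needed."""
    position = COLUMNS.index("total")
    return sum(row[position] for row in store)


def sum_total_column_store(store):
    """Only the single column of interest is read."""
    return sum(store["total"])


new_invoice = (103, 2, 20.0)
row_writes = insert_row_store(row_store, new_invoice)
column_writes = insert_column_store(column_store, new_invoice)

print(row_writes, column_writes)
print(sum_total_row_store(row_store), sum_total_column_store(column_store))
```

With three columns the difference is small, but the number of writes per inserted row grows with the number of columns in the column store, while the cost of reading one column stays independent of how wide the table is.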

Most columnar databases support SQL. The distinction between row-based and columnar databases is in the storage engine. Queries are written in SQL, and the database optimizes them for the storage engine, which executes the actual read and write operations. In fact one columnar database, InfoBright, is actually delivered as a database engine for MySQL and accessed using SQL queries in the MySQL dialect.

Business Intelligence and Unstructured Data

Introduction

As a rule, analysis software requires much higher data quality than operational software. A typing error in a customer name in an invoice is not enough to break the transaction in most systems. But when it comes to analyzing the data, much more precision is required. In fact, the issue of data quality was identified by The BI Survey 9 as the most common problem in business intelligence projects.

So what is to be made of the idea of business intelligence and unstructured data? The term “unstructured data” is not particularly precise, but in the end it means data that is difficult to deal with, usually because it has no data model. This certainly does not sound like a very promising area for analysis. Unfortunately, a lot of the information available to companies is in unstructured data, although it is hard to say exactly how much.

In fact, business intelligence tools cannot analyze unstructured data directly. Any project of this type has two distinct stages – in the first stage, specialized software analyzes the unstructured data, reduces it and produces a data model that a BI system can deal with. Then the data can be analyzed using business intelligence tools. Text based unstructured data is by far the most common, but there are many other forms. In this article, we will only discuss text analysis.

Text analysis tools

Text analysis tools process texts and add metadata for analysis. The metadata consists of semantic tags attached to the documents. The resulting data is often stored in search-engine-style tables – obviously, there is a large overlap between search engine technology and BI for unstructured data. The analysis software defines clusters, which are sets of data with the same semantic tags. This process removes the “noise” that natural language texts inevitably contain. The clusters are then treated as the objects of analysis. So the final analysis carried out by the end users does not actually look at the raw data itself. Instead, it looks at these cluster objects, which are abstract and modular enough to be used for high level analysis.

Not all companies that want to analyze unstructured data need a lot of complexity in the front-end. One typical application of unstructured data analysis is finding out whether consumers have a positive or negative attitude towards a brand. This can be achieved to a certain extent by analyzing comments on public Web forums. There is a lot of complex technology involved in this process, but the final results are quite simple to present – essentially a thumbs-up (or thumbs-down) for the brand in question.

There are various ways to enrich the data model to make it more susceptible to typical OLAP technology. In some cases, the clusters are arranged hierarchically, so users can drill down from general terms to terms that are more specific. For example, a car company might analyze comments from on-line forums about the opinions car owners have. A high level cluster in this case might be engines, and from here, the user could drill down to engine problems, or performance, or other concepts. It is also common to add a time line to the data. In this case, the analysis is often a simple cluster frequency analysis over time. From the point of view of OLAP analysis, this is not very complicated, but even something this simple can be extremely valuable to a company. Other examples of where the same type of analysis could be useful include the analysis of letters from customers, or call center data.
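
A cluster frequency analysis over time with a simple drill-down can be sketched in a few lines of Python. The input is assumed to be a list of dated comments that a text analysis engine has already tagged with a hierarchical cluster path. The dates, the cluster names and the slash-separated path format are all invented for the example.

```python
from collections import Counter

# (date, cluster path) pairs as a text analysis engine might emit them.
tagged_comments = [
    ("2010-01-03", "engine/problems"),
    ("2010-01-17", "engine/performance"),
    ("2010-01-20", "brakes/noise"),
    ("2010-02-02", "engine/problems"),
    ("2010-02-11", "engine/problems"),
]


def cluster_frequency(comments, depth):
    """Count comments per month and per cluster, cut off at the given depth."""
    counts = Counter()
    for date, cluster in comments:
        month = date[:7]  # 'YYYY-MM'
        path = "/".join(cluster.split("/")[:depth])
        counts[(month, path)] += 1
    return counts


top_level = cluster_frequency(tagged_comments, depth=1)   # e.g. 'engine'
drill_down = cluster_frequency(tagged_comments, depth=2)  # e.g. 'engine/problems'

for key, count in sorted(drill_down.items()):
    print(key, count)
```

The depth parameter plays the role of the drill-down level: depth 1 gives the general terms, depth 2 the more specific ones below them.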

More than simple string matching

Unstructured data analysis solutions start by using a natural language processing engine to derive the clusters. Measuring keyword density is the most important method in this type of analysis. The first step to measuring keyword density is to apply a so-called stop word list to a text. Stop words are words that do not carry enough meaning to help establish a context. A typical English stop word list would begin “a, about, after, again, against, all…” The more commonly a word occurs in a generic text, the less interesting it is for analysis of keywords in a specific text. Usually the stop words are not removed from the text, just replaced by a single symbol to maintain the text structure.
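
As a rough sketch of this first step, the following Python code masks stop words with a placeholder symbol and then measures keyword density. The stop word list starts with the words quoted above and adds a few more as an assumption. The placeholder character, the tokenizing rule and the choice to divide by the total token count are likewise illustrative choices, not a description of any specific product.

```python
import re
from collections import Counter

# The first six entries follow the list quoted above; the rest are assumed.
STOP_WORDS = {"a", "about", "after", "again", "against", "all",
              "and", "but", "is", "the"}
PLACEHOLDER = "_"


def mask_stop_words(text):
    """Tokenize the text and replace each stop word by a placeholder symbol."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [PLACEHOLDER if token in STOP_WORDS else token for token in tokens]


def keyword_density(tokens):
    """Share of all token positions taken up by each remaining keyword."""
    counts = Counter(token for token in tokens if token != PLACEHOLDER)
    return {word: n / len(tokens) for word, n in counts.items()}


masked = mask_stop_words("The engine is loud, but the engine is reliable")
density = keyword_density(masked)

print(masked)
print(density)
```

Because the stop words are replaced rather than deleted, the positions of the remaining keywords in the text are preserved, which matters for any later measure of how close two terms are to each other.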

Text analysis engines also use word root analysis to group words by ignoring inflections, so “engine” and “engines” can be treated as the same keyword. Synonym word lists, which usually depend on the application, are used in the same way to reduce the total number of terms. Synonyms are a typical example of the linguistic noise that makes texts difficult for computers to analyze.
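
The idea can be sketched as a two-step normalization. The suffix rule below is deliberately crude and stands in for the proper word root analysis a real engine would use, and the synonym list is a hypothetical, application-specific example.

```python
# Application-specific synonym list; the entries are invented for illustration.
SYNONYMS = {"motor": "engine", "powerplant": "engine"}


def root(word):
    """Crude plural stripping, standing in for real word root analysis."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def normalize(word):
    """Reduce a word to its root, then map synonyms to one common keyword."""
    base = root(word.lower())
    return SYNONYMS.get(base, base)


words = ["Engines", "engine", "motors", "Powerplant", "glass"]
keywords = [normalize(word) for word in words]
print(keywords)
```

After normalization, four of the five input words count as the same keyword, which is the reduction in the total number of terms described above.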

Once the texts have been cleaned up, the system can carry out a statistical analysis of the resulting set of interesting words. This is often done using a word list as a basis – depending on the business. For example a company could look for mentions of its own name, and try to determine if it is associated with positive or negative terms. The document needs to be divided into segments in some way, or the system needs some statistical measure of the nearness of terms to one another in the text.
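
A minimal sketch of the segment-based variant follows. It treats each sentence as a segment and counts positive and negative terms that share a segment with the company name. The word lists, the brand name and the scoring rule are assumptions chosen for the example. Real systems use far larger dictionaries and more careful segmentation.

```python
import re

# Hypothetical word lists; a real project would develop these during training.
POSITIVE = {"reliable", "great", "love"}
NEGATIVE = {"loud", "broken", "hate"}


def brand_sentiment(text, brand):
    """Score a brand by the rated terms that appear in the same sentence."""
    score = 0
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = set(re.findall(r"[a-z]+", sentence))
        if brand.lower() in words:
            score += len(words & POSITIVE) - len(words & NEGATIVE)
    return score


forum_post = "I love my Acme car. The Acme dealer was great. My old car was broken."
complaint = "The Acme engine is loud."

positive_score = brand_sentiment(forum_post, "Acme")
negative_score = brand_sentiment(complaint, "Acme")
print(positive_score, negative_score)
```

The third sentence of the first text contains a negative term but does not mention the brand, so it does not affect the score. This is the purpose of dividing the document into segments.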

Vendors of text analysis systems tend to adopt one of two approaches: a linguistic approach or a statistical approach. It is difficult to make a general statement on which is better, and companies need to carry out a proof of concept to see which makes more sense in their specific situation. In many cases, the two approaches complement each other. Smaller specialized companies are often strongly oriented towards one approach or the other.

Training Phase

Regardless of the approach the vendor takes, analysis systems for unstructured data all require a training phase. When the project begins, there simply is not enough example data in the system to carry out reliable analyses. Companies need to develop custom dictionaries to get the best results. This means that the projects inevitably take months to get working. And as with any IT project, long deployment times tend to result in changing requirements and disagreements on the goals. Unfortunately, there is no simple solution to this problem. However, one possible approach is to use knowledge already available to the customer to structure the dictionary. In other words, instead of starting the project with a completely blank slate, companies tend to bring in a clear list of concepts and key words they would like to analyze.

Another issue that is related to the amount of predefined information that flows into the system is the balance of flexibility and ease of use for the end user. Just as in business intelligence, the most flexible system is not always the best, because many users require some guidance when navigating complex data.

Solutions

The solutions we have seen are often produced as a cooperation between two or more companies, with a BI specialist providing the front-end and a specialist for unstructured data processing the raw data to feed the system. It is typical for the more specialized, linguistics-oriented systems that the native front ends are simple and do not offer as many functions. Vendors who provide statistical analysis are often accustomed to producing data models for BI front ends, not least because this type of technology is older, more generally applicable and better established in the market.

Choose a product – not a vendor

Bigger is not always better – especially in the software industry. When companies seek assistance in selecting a software vendor, investment security is always one of their top concerns. Many are wary of small software vendors due to the potential risk of bankruptcy or competitor buyouts, which could later result in additional investments or expensive migration projects. Although these concerns may be valid, the conclusions made by many decision-makers are just plain wrong. Choosing a bigger software provider is no guarantee for a stable software solution. In fact, large vendors typically release products outside of their core offerings more slowly than small, specialized companies that have just one or two products. We regularly find that such non-core products are uncompetitive compared to products from specialists.

The Business Intelligence market is a prime example. No major player has a clean slate when it comes to product continuity. IBM has discontinued its Business Intelligence tools and DB2 OLAP Server – and not just the OEM components, but its own development as well. Oracle has also made a break from its own portfolio, and from some of its acquired Hyperion products, and now promotes the former Siebel BI front-ends that came with the Siebel CRM products. SAP also stopped development of all of its front-end tools and applications after acquiring Business Objects. Even Microsoft, which purchased Data Analyzer back in 2001, pulled the product from the market without further comment and, more recently, stopped development of its planning product PerformancePoint Server in 2009 after just two years on the market.

Another way to look at investment security is the share of a vendor’s total revenue that comes from specific business intelligence tools. The whole of Cognos – one of the top 3 players in the BI industry with about $1 billion revenue before the takeover by IBM – now makes up only 1 percent of the total revenue of IBM. Similarly, the BI related revenues of Oracle and Microsoft are less than 5 percent of the total revenue. A large vendor is unlikely to find it difficult to discontinue one unsuccessful tool from a sizeable BI portfolio. More probably, that large vendor would swallow more vendors and use newly acquired tools to replace existing ones.

Of course, specialists also discontinue products in their portfolios. However, if a small vendor has competitive products that make up the core of its business, your investment is much safer there than with a weak product from a large company. Even if the niche player were bought up by a competitor, acquiring the technology or the good products would most likely be the driving motive. In most cases, the larger company continues to sell and develop the specialized products.

Before you invest in new software, ask yourself the following questions:

  • Does the software have functional and technological advantages over other products in the market, including those from less well-known vendors?
  • Is the software a core component of the vendor’s portfolio?
  • Is the product prominent in the vendor’s Web site, user conference program, marketing events and financial reports?
  • Does the software fit in the vendor’s overall product and market strategy?
  • Are there no overlapping products in the vendor’s product line that may replace it in the future?
  • Has the vendor invested in significant further development in recent years (or was the product merely supported)?

If you have answered “yes” to all of these questions, the vendor offers the minimum required level of investment security. Size alone, however, is irrelevant when buying software.
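The rule above (every question must be answered "yes") can be written down as a small check. This is only a sketch: the key names are hypothetical shorthand for the six questions and do not come from any tool, and an unanswered question is assumed to count as "no".

```python
# Hypothetical shorthand keys for the six questions in the checklist above.
CHECKLIST = (
    "functional_and_technological_advantages",
    "core_component_of_portfolio",
    "prominent_in_vendor_communications",
    "fits_vendor_strategy",
    "no_overlapping_products",
    "significant_recent_development",
)


def open_issues(answers):
    """Return the questions that were not answered 'yes' (missing counts as 'no')."""
    return [question for question in CHECKLIST if not answers.get(question, False)]


def meets_minimum_investment_security(answers):
    """True only if every question on the checklist was answered 'yes'."""
    return not open_issues(answers)


# Example: one 'no' is enough to fail the minimum level.
answers = {question: True for question in CHECKLIST}
answers["no_overlapping_products"] = False
print(meets_minimum_investment_security(answers))
print(open_issues(answers))
```

Note that vendor size does not appear anywhere in the check, which mirrors the point that size alone is irrelevant.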

 


Formatting Reports in Business Intelligence Tools

Introduction

Reporting tools vary a great deal in the way they present data. The following table provides a simple overview of the basic layout types available.

| Type | Description | Example |
|---|---|---|
| Canvas / Layers | Objects are positioned absolutely on a canvas. For interactive reports, dialogs, etc. Sometimes for page-based reporting. | arcplan, IBI PowerPainter, Layout Painter, Oracle BI Publisher |
| Grid | Objects are positioned in a grid. Ad hoc browsing, simple reporting. Often includes cross tables. Sometimes recursive. | Cognos Report Studio, Oracle EE Answers |
| Column Structure | The column structure of the individual rows is predefined, but the rows can be freely defined. For custom schedules in financial reporting and planning systems. | IBI Matrix reports, Cubus, MIS Package |
| Band | Divided into separate functional sections which are as wide as a document page and are stacked vertically in the report. Usually page-based. | Crystal, IBI Report Painter, MS Reporting Services, MS Access |
| Cell Based | Spreadsheet function for each number. Almost always planning systems. Like a grid, but static. | Applix, Infor Office Plus |
| Tiles | The page is divided into tiles and objects stack in them. One version, columns, is usually found in portals. | IBI Dashboard, Cognos Connection, Oracle EE Dashboard |

Table 1: Layout Types

Combining Layout types

Recursion

These layout schemes are often recursive, meaning one occurs inside another. For example, grids are often placed side by side on a canvas, or in columns.

The Oracle BIEE Dashboard has three layers of recursion. The columns in the Dashboard are a type of tiling layout. They can contain an Answers report, which has another tiling layout. This in turn holds data grids.
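The nesting described above can be pictured as a small tree of layouts. The sketch below is illustrative only: the `Layout` class and the `nesting_depth` function are hypothetical names, not part of any Oracle API, and the tree simply restates the dashboard example from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Layout:
    kind: str   # layout type, e.g. "tiles", "grid", "canvas"
    name: str   # descriptive label only
    children: list = field(default_factory=list)


def nesting_depth(layout):
    """Count the layout levels from this node down to its deepest child."""
    if not layout.children:
        return 1
    return 1 + max(nesting_depth(child) for child in layout.children)


# The dashboard example: tiled columns hold an Answers report,
# which has its own tiling layout, which in turn holds a data grid.
dashboard = Layout("tiles", "Dashboard columns", [
    Layout("tiles", "Answers report", [
        Layout("grid", "Data grid"),
    ]),
])

print(nesting_depth(dashboard))  # 3
```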

Page layouts

Whether a system is page-based is independent of these layout schemes, although columns are usually used for portals and bands for page-based reporting. There are several different ways to deal with pages.

  • Flow Layout – objects can be laid out on a page similar to how Web pages are designed. Report components such as tables, charts, images and crosstabs can be laid out in the report design.
  • Tabular Layout – presented in cells within tables
  • Banded Layout – the report designer has complete control over the precise location of report objects and components to produce a highly structured report

Reporting systems intended for mass printed reporting should include a page concept in the report design, not just in the print preview. For example, a common requirement is page sums.
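Page sums show why the page has to be known at design time: the sum depends on where the page breaks. The following sketch assumes a fixed number of rows per page, which is a simplification; real tools break pages based on the rendered height of the content. The function name and the dictionary keys are hypothetical.

```python
def paginate_with_page_sums(amounts, rows_per_page):
    """Split amounts into pages; attach a page sum and a running total to each."""
    if rows_per_page < 1:
        raise ValueError("rows_per_page must be at least 1")
    pages = []
    running_total = 0
    for start in range(0, len(amounts), rows_per_page):
        rows = amounts[start:start + rows_per_page]
        page_sum = sum(rows)
        running_total += page_sum
        pages.append({
            "rows": rows,
            "page_sum": page_sum,             # total of this page only
            "carried_forward": running_total  # total of all pages so far
        })
    return pages


for number, page in enumerate(paginate_with_page_sums([100, 250, 75, 300, 125], 2), start=1):
    print(number, page["rows"], page["page_sum"], page["carried_forward"])
```

Changing `rows_per_page` changes every page sum, which is why a page concept that exists only in the print preview is not enough.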

Pixel perfect reporting

Printed reporting systems are often referred to as pixel-perfect reporting. In fact “pixel-perfect” tends to be bandied about quite a bit in the BI business, and nearly all vendors claim to support it. Unfortunately, the term is not defined consistently.

The term “pixel-perfect” was actually invented to describe the behavior of a CRT screen – a screen is referred to as pixel-perfect if the resolution being used actually corresponds to the number of physical dots on the physical screen. It is also referred to as ‘native resolution’.

Later, the term took on another meaning. For gamers the idea of “pixel-perfect collision detection” means that the objects on screen act like they collide when they look like they collide.

But “pixel-perfect” has also crept into the area of reporting systems and taken on a somewhat vague meaning indirectly derived from the term’s screen based roots. In reporting software it refers to a high level of typographical control and graphics capabilities and to the software’s ability to create output suitable for volume printing, in particular highly formatted PDF files.

A good system that calls itself pixel-perfect should provide users with the features they would expect from a desktop publishing product. Among other things, they should have the features described in the following table.

| Feature | Description |
|---|---|
| Draw objects everywhere | Products that arrange objects in a grid are not considered pixel-perfect systems. The ability to overlap objects, including text objects, and the placement of individual objects should – at least optionally – be independent of the other objects in the report. |
| Rich layout and printing options | Printing at angles, and providing other rich graphical options with exact control of the output. This can be a problem with server-generated reports, where a Java server may have fewer fonts installed than a rich Windows client. |
| Dynamic Page Sizes | The concept of a page should be built into the system, allowing the report designer to have tight control over the report’s behavior when the page breaks. Page sums are a typical print report option – another is dealing with varying screen and page (e.g., A4 and letter) sizes. |
| WYSIWYG | A pixel-perfect system would also be expected to provide the report designer with a clear visual clue as to what the final printout is going to look like, avoiding surprises such as unexpectedly overlapping objects in the final printout. In this sense the word is similar to the gamer’s idea of collision detection. |

Table 2: Formatting features in standard reporting

The best standard reporting tools can be used to print custom page sizes such as the output of ATMs, tax returns, or other specialized formats. Most so-called pixel-perfect BI reporting tools don’t really qualify for the term, but few BI applications need it. Applications that have to print checks or invoices may well need this level of accuracy, but management reports simply do not require so much precision. Trying to produce it is a waste of effort, and it is better for reports to be flexible in adapting to changing data structures.

Pixel-Perfect sometimes means transparent.

Examples

The following section provides examples of a few of the more important layout types.

Bands

Bands always have another more detailed method of laying out reports embedded in them. For example, each band may contain a grid or the bands may be placed on a canvas.

Crystal Reports was named after the bands (actually quantum states) in crystals – an engineering joke. The model is a canvas embedded in a band layout. The model requires no tables at all. The data is presented in fields and other objects (charts) that are placed in the band. The band is reproduced in its entirety for each row of the data. Microsoft Access is a widespread tool with a similar model to Crystal.

Figure 1: An example of the band model
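The core idea of the band model, a detail band that is reproduced once for each row of data, can be sketched in a few lines. This is a deliberately simplified illustration: the bands here are plain text templates rather than canvases with positioned objects, and the function and parameter names are hypothetical.

```python
def render_banded_report(rows, header_band, detail_band, footer_band):
    """Emit the header once, the detail band once per data row, then the footer."""
    lines = [header_band]
    for row in rows:
        # The whole detail band is repeated for every row of the data.
        lines.append(detail_band.format(**row))
    lines.append(footer_band.format(count=len(rows)))
    return lines


rows = [
    {"customer": "Acme", "revenue": 1200},
    {"customer": "Globex", "revenue": 950},
]
report = render_banded_report(
    rows,
    header_band="Customer      Revenue",
    detail_band="{customer:<10}{revenue:>11}",
    footer_band="{count} rows",
)
print("\n".join(report))
```

No table object is involved: the fields sit in the band, and the repetition of the band produces the rows.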

Hyperion Interactive Reporting’s standard reports are also a canvas in a band. However, the body band is only displayed once. The data is displayed in a table which contains all the rows, so there is less flexibility in positioning them. The band can contain several tables and charts.

Grid

A grid layout divides the canvas into rows and columns and allows content to be positioned in each individual cell. The best known example of grid layout is the spreadsheet, originally a simple system that only allowed a single number or text in each cell. Modern spreadsheets also support chart objects layered on top of the grid. Microcharts, which are complete charts positioned in individual cells, are much more in the spirit of the grid concept. General grids allow large objects and even entire tables in the cells as well as individual numbers.

Microsoft Reporting Services and IBM Cognos Report Studio both use the grid concept. Report Studio is recursive, meaning it allows grids within cells, and grids in those cells and so on.

Canvas

Canvas layouts are closely related to the idea of layers. They can be absolute layouts or allow objects to flow. Absolute layouts are only possible when the objects do not grow.

PowerPoint is a well-known example of the canvas layout. The canvas offers the most powerful way to locate objects precisely.

Tiles

Tiled layouts consist of a set of panes of various sizes and shapes.

Figure 2: A tiled layout in Cognos 10

Column structure is a special form of tiling that is often used in web sites. The individual panes are arranged in columns. However, there is no coordination of the heights of the individual panes between the columns.

Figure 3: A search engine screenshot showing column structures. The same layout is often used for dashboards.
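The lack of height coordination between columns can be made concrete with a small placement routine. This is a sketch under simple assumptions: pane heights are given in arbitrary units, and the function name and the sample pane names are hypothetical.

```python
def stack_panes_in_columns(columns):
    """Place panes top to bottom within each column.

    columns is a list of columns, each a list of (name, height) pairs.
    Returns {name: (column_index, top_offset)}. Every column keeps its own
    running offset, so pane edges in neighbouring columns need not line up.
    """
    positions = {}
    for column_index, panes in enumerate(columns):
        top = 0
        for name, height in panes:
            positions[name] = (column_index, top)
            top += height
    return positions


layout = stack_panes_in_columns([
    [("news", 120), ("weather", 80)],
    [("search", 60), ("stocks", 200)],
])
print(layout)
```

In this example the second pane starts at offset 120 in the first column and at offset 60 in the second, which is the behaviour that distinguishes a column structure from a grid.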


Proof of Concept

When implementing complex solutions, particularly in the area of BI, customers often run into the ‘tool assumption’ problem. Although they are not usually discussed in these terms, advanced BI products can be seen as rapid development tools that minimize the need for technical know-how. They work by limiting themselves to a specific type of application. This involves making assumptions about what kind of data will be processed and how it will be processed.

The difficulty that arises is that the assumptions made by the product designer are aimed at a fairly general market, which means that the products can do a lot of impressive things very well, but do not necessarily fill the specific requirements of each individual customer.

Unfortunately, the very features that make this kind of product so convenient to use the way it was designed to be used tend to make them difficult to use in any other way. In other words, when the salesman says in his presentation “All you need to do to define this type of report is press this button”, it often implies that there is no convenient way to define reports that are somewhat different.

BI software has come a long way, but the trade-off between convenience and flexibility still exists. A typical scenario is that a customer sees a presentation of a product, and buys the product because he is convinced that it basically fulfils his requirements. And in the project, it turns out that the product does indeed fulfil nearly all of the requirements. But as the project proceeds, one detail after another is discovered where the product does not quite work the way the customer expected it to.

Aside from performance, most of the quality problems and cost overruns in BI projects are the result of seemingly small issues that the software platform fails to address and that have to be dealt with using complex workarounds. The customer will have to either accept these cost overruns or compromise on his requirements.

To avoid this problem it is advisable to carry out a formal proof of concept workshop. Here, a short list of vendors is given a set of scenarios based on real data and asked to implement a small project.

A proof of concept should include the following:

  • A list of requirements for the vendors
  • A set of scenarios that the vendors can implement in a day or two and that cover the requirements
  • A final presentation by the vendors to the end users
  • A questionnaire for the end users to allow them to give their opinion of the results.

When carrying out a proof of concept, don’t forget the basics. For example:

  • About two thirds of the time spent on a typical BI project is spent on data import. Make sure the vendors can import your data.
  • BI projects often run into difficulties because the product runs into technical or security issues at the customer site that are not anticipated by the vendor. Allow half a day or more for installation.

If the vendor or other external implementers will be carrying out the project, then it also makes sense to check their technical and communication abilities. This is vital to the success of the project.

Most importantly, make sure the proof of concept offers you some kind of closure: when it is over, no matter what the results are, you should be in a better position to make a final decision about which vendors to remove from the list, and which to keep.
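One way to reach that kind of closure is to turn the end-user questionnaire into a simple keep or remove decision. The sketch below is only one possible approach and rests on assumptions that are not part of the text above: each end user gives each vendor a single score on a 1 to 5 scale, and a vendor stays on the list if its average score reaches a threshold chosen in advance. The vendor names and scores are made up for illustration.

```python
from statistics import mean


def split_short_list(responses, threshold):
    """Split vendors into (keep, remove) by their average questionnaire score.

    responses maps a vendor name to the list of end-user scores it received.
    Vendors to keep are sorted best first; vendors to remove alphabetically.
    """
    averages = {vendor: mean(scores) for vendor, scores in responses.items()}
    keep = sorted(
        (vendor for vendor, average in averages.items() if average >= threshold),
        key=lambda vendor: averages[vendor],
        reverse=True,
    )
    remove = sorted(vendor for vendor, average in averages.items() if average < threshold)
    return keep, remove


# Made-up scores from three end users per vendor (assumed 1 to 5 scale).
responses = {
    "Vendor A": [4, 5, 4],
    "Vendor B": [2, 3, 2],
    "Vendor C": [3, 4, 4],
}
keep, remove = split_short_list(responses, threshold=3.5)
print("keep:", keep)
print("remove:", remove)
```

Fixing the threshold before the workshop helps ensure that the result leads to a decision whatever the scores turn out to be.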