Data integration is a pervasive challenge faced in applications that need to query across multiple autonomous and heterogeneous data sources. Data integration is crucial in large enterprises, large-scale scientific projects, and government agencies. Data integration also holds the promise of fueling the next revolution of data content on the Web. This talk will review some of the impressive progress on data integration made in research and in industry, but will argue that despite the progress, data integration is either still too hard for most users or does not address the real needs in applications. I will describe a new abstraction, dataspaces, that attempts to address these two challenges. I will give examples of Web-scale data management at Google that motivate the need for dataspaces.