Applying big data integration techniques in real scenarios requires addressing practical issues related to the scalability of the process and the heterogeneity of data sources. The Re-search Alps Project, funded by the EU Commission through the INEA Agency in the CEF Telecom framework, aims to create an open dataset describing innovation and research organizations located in the Alpine area. The inputs to the process are open data sources, websites, portals, and public registries. The talk will describe the main challenges addressed in the project and the big data integration pipeline adopted. In particular, selecting the datasets related to a domain of interest is a critical task in all data integration approaches. The talk will also demonstrate a tool, developed in the project, for exploring the content of a dataset.
The project was presented by Francesco Guerra, principal investigator, at the University of Rijeka (Croatia).