2nd International Workshop on Linked Science 2012—Tackling Big Data (LISC2012)
Workshop etherpad: http://epad.ifgi.de/p/lisc2012
When: November 12th, 2012
Where: Boston, USA
Collocated with the 11th International Semantic Web Conference (ISWC2012).
Submission Deadline (extended): August 4, 2012, 23:59 Hawaii Time (originally July 31, 2012) (NOTE: hard deadline)
Notification Due (extended): Aug 23, 2012 (originally Aug 21, 2012)
Final Version Due: Sep 10, 2012
Workshop URI: http://linkedscience.org/events/lisc2012
Submissions via: https://www.easychair.org/conferences/?conf=lisc2012
Scientific communication has traditionally relied upon publications and presentations, with millions of publications per year worldwide by current estimates; the growth rate of PubMed alone is now 1 paper per minute. The results described in these articles are often backed by large amounts of diverse data produced by complex experiments, computer simulations, and observations of physical phenomena. Because of this avalanche of data, it is increasingly hard to validate, reproduce, reuse and leverage scientific data. In addition, although publications, methods and datasets are closely related, they are not easily accessible and interlinked. The notable exception is omics research, where journals require the deposit of sequences in databanks as a condition of publication. Even where data is discoverable and accessible, significant challenges remain in data reuse and sharing, and in facilitating the necessary correlation, integration and synthesis of data across levels of theory, techniques and disciplines.
In the 2nd International Workshop on Linked Science (LISC2012) we will discuss and present results of new ways of publishing, sharing and linking scientific data together, and reasoning over such data to discover interesting new links to validate research. The theme of this year’s workshop will focus on research addressing these issues with respect to big data. Big Data is loosely characterized by the size and/or number of individual files, the number of represented variables, a range of physical scales, a range of scientific disciplines, and heterogeneous metadata and data formats; in short, data that cannot easily be accessed and manipulated from a thumb drive.
Making entities identifiable and referenceable using URIs augmented by semantic, scientifically relevant annotations greatly facilitates data discovery and access. This Linked Science approach, i.e., publishing, sharing and interlinking scientific resources and data, is of particular importance for scientific research, where sharing is crucial for facilitating reproducibility and collaboration within and across disciplines. This integrated process, however, has not been established yet. Bibliographic contents are still regarded as the main scientific product, and associated data, models and software are either not published at all, or published in separate places, often with no reference to the respective paper.
In the workshop we will discuss whether and how new emerging technologies (Linked Data, and semantic technologies more generally) can realize the vision of Linked Science. In particular, this year, we plan to focus on the theme of Tackling Big Data, soliciting contributions that discuss issues of analyzing, aggregating, and using the vast amount of data that scientists produce today. Both in the United States and in Europe, not only researchers but also governments are beginning to realize the urgent need to analyze and process this data, with funding agencies and research institutions starting new initiatives. Our workshop will help catalyze the use of semantic technologies and linked-data approaches in solving the big-data challenge.
At LISC2012 we will discuss and present results of new ways of publishing, sharing, linking, and analyzing such scientific resources motivated by driving scientific requirements, as well as reasoning over the data to discover interesting new links and scientific insights.
LISC2012 is a continuation of the 1st International Workshop on Linked Science 2011 (LISC2011), collocated with the 10th International Semantic Web Conference (ISWC2011) in Bonn. LISC2011 raised significant interest. It was the third largest workshop of ISWC2011 in terms of the number of participants (35 registered). The discussion was lively, and breakout sessions identified a research agenda for Linked Science. The participants asked for the continuation of the Linked Science workshop series, and LISC2012 is an answer to this call.
We invite two kinds of submissions:
- Research papers. These should not exceed 12 pages in length.
- Position papers. Novel ideas, experiments, and application visions from multiple disciplines and viewpoints are a key ingredient of the workshop. We therefore strongly encourage the submission of position papers. Position papers should not exceed 4 pages in length.
Submissions should be formatted according to the Lecture Notes in Computer Science guidelines for proceedings available at http://www.springer.com/computer/lncs?SGWID=0-164-7-72376-0. Papers should be submitted in PDF format. All submissions must be made electronically via the LISC2012 web submission system.
At least one author of each accepted paper must register for the workshop. All workshop participants have to register for the main conference, ISWC2012, as well.
TOPICS OF INTEREST:
In both categories, papers are expected in (but not restricted to) the following topics:
- Large scale data integration in the sciences
- Analysis of large scale linked data
- Formal representations of scientific data
- Ontologies for scientific information
- Reasoning mechanisms for linking scientific datasets
- Integration of quantitative and qualitative scientific information
- Ontology-based visualization of scientific data
- Semantic similarity in science applications
- Semantic integration of crowd sourced scientific data
- Key research life cycle challenges in enabling linked science and proposed solution strategies
- Interrelationship of existing traditional solutions and new linked science solutions
- Connecting scientific publications with underlying research datasets
- Provenance, quality, privacy and trust of scientific information
- Enrichment of scientific data through linking and data integration
- Semantically-driven data integration
- Formal encoding of scientific information
- Scalability of existing semantic and Linked Data solutions for managing scientific resources
- Support for data publishing for sharing and reuse
- Case studies on linked science, e.g., astronomy, biology, environmental and socio-economic impacts of global warming, statistics, environmental monitoring, and cultural heritage
- Linked Data for dissemination and archiving of research results, for collaboration and research networks, and for research assessment
- Applications for research that build on top of Linked Data
- Legal, ethical and economic aspects of Linked Data in science
The Workshop Proceedings have been published online:
- Tomi Kauppinen, Line C. Pouchard, Carsten Keßler (Eds.): Proceedings of the Second International Workshop on Linked Science 2012–Tackling Big Data, CEUR Workshop proceedings Vol-951, In conjunction with the International Semantic Web Conference (ISWC2012). Boston, MA, USA, November 12, 2012.
- Bernardo Gonçalves, Fabio Porto and Ana Maria Moura. On the semantic engineering of scientific hypotheses as linked data
- Suppawong Tuarob, Jeffery S. Horsburgh, Natasha Noy, Giri Palanisamy and Line C. Pouchard. ONEMercury: Towards Automatic Annotation of Environmental Science Metadata
- Albert Meroño-Peñuela, Ashkan Ashkpour, Laurens Rietveld, Stefan Schlobach and Rinke Hoekstra. Linked Humanities Data: The Next Frontier?
- Christian Y. A. Brenninkmeijer, Chris Evelo, Carole Goble, Alasdair J. G. Gray, Paul Groth, Steve Pettifer, Robert Stevens, Antony Williams and Egon L. Willighagen. Scientific Lenses over Linked Data: An approach to support task specific views of the data. A vision.
- Jun Zhao, Graham Klyne, Piotr Holubowicz, Raul Palma, Stian Soiland-Reyes, Kristina Hettne, Jose Enrique Ruiz, Marco Roos, Kevin Page, Jose Manuel Gomez-Perez, David De Roure and Carole Goble. RO-Manager: A Tool for Creating and Manipulating Research Objects to Support Reproducibility and Reuse in Sciences
- Daniel Garijo and Yolanda Gil. Augmenting PROV with Plans in P-PLAN: Scientific Processes as Linked Data
- Nathan Wilson, Han Wang and Deborah McGuinness. Scientific Names and Descriptions for Organisms on the Semantic Web
- Kevin Page, Raúl Palma, Piotr Hołubowicz, Graham Klyne, Stian Soiland-Reyes, Don Cruickshank, Rafael González Cabero, Esteban García Cuesta, David De Roure, Jun Zhao and José Manuel Gómez-Pérez. From workflows to Research Objects: an architecture for preserving the semantics of science
- Yolanda Gil, Varun Ratnakar and Paul Hanson. Organic Data Publishing: A Novel Approach to Scientific Data Sharing
The workshop will take place on Monday, November 12, 2012 as a pre-conference workshop at ISWC 2012 in Boston, MA. There will be 30 minutes (including time for questions) for each full paper, and 15 minutes for each short paper (again, including time for questions). Full papers are marked with a * in the program. The room for the workshop has not been announced yet.
09:00 – 09:15: Intro
09:15 – 10:00: Keynote: Line Pouchard: Semantic Challenges and Opportunities in DataONE
10:00 – 10:30: Session 1: Scientific Hypotheses B. Gonçalves et al.: On the semantic engineering of scientific hypotheses as linked data *
10:30 – 11:00: Coffee break
11:00 – 11:30: Session 1 (ctd.) Y. Gil et al.: Organic Data Publishing: A Novel Approach to Scientific Data Sharing *
11:30 – 11:45: Session 2: Provenance/Research Objects J. Zhao et al.: RO-Manager: A Tool for Creating and Manipulating Research Objects to Support Reproducibility and Reuse in Sciences
11:45 – 12:00: D. Garijo & Y. Gil: Augmenting PROV with Plans in P-PLAN: Scientific Processes as Linked Data
12:00 – 12:15: N. Wilson et al.: Scientific Names and Descriptions for Organisms on the Semantic Web
12:15 – 12:30: K. Page et al.: From workflows to Research Objects: an architecture for preserving the semantics of science
12:30 – 14:00: Lunch
14:00 – 14:30: Session 3: Enriching Content with Semantics A. Meroño-Peñuela et al.: Linked Humanities Data: The Next Frontier? *
14:30 – 15:00: S. Tuarob et al.: ONEMercury: Towards Automatic Annotation of Environmental Science Metadata *
15:00 – 15:15: C. Brenninkmeijer et al.: Scientific Lenses over Linked Data: An approach to support task specific views of the data. A vision.
15:15 – 15:30: Deciding on topics for the break-out sessions, forming groups
15:30 – 16:00: Coffee break
16:00 – 17:30: Break-out sessions and wrap-up
- Paper submission deadline (extended): August 4th, 2012, 23:59 Hawaii Time (originally July 31, 2012)
- Notification of acceptance or rejection: August 23rd (extended)
- Camera-ready version due: September 10th
- Workshop day: November 12th, 2012
Tomi Kauppinen is a postdoctoral researcher in the Department of Media Technology at the Aalto University School of Science. He holds a PhD from Aalto University, Finland, with a thesis on reasoning about change and time. He co-chaired the First International Workshop on Linked Science 2011 at the International Semantic Web Conference (ISWC2011) and the track on Interoperability and Semantics of the Geoinformatik 2011 conference, and led the breakout session on Vocabularies for Science at Science Online London 2011 organized by Nature. He is also an organizer of the Workshop on GIScience in the Big Data Age 2012 (GIBDA2012). His research focuses on spatiotemporal and semantic modeling of events and processes. His current projects include opening and linking of scientific and educational data in LinkedScience.org and in the Linked Open Aalto projects. You can find him on Twitter: @LinkedScience
Line C. Pouchard is an Information Scientist in the Scientific Data Group at Oak Ridge National Laboratory, US Department of Energy. With Tomi Kauppinen and Carsten Keßler, she co-chaired Linked Science 2011 at ISWC in Bonn. Her recent work includes implementing semantic technologies to improve observation data discovery in the Earth and Atmospheric Sciences for the NASA-sponsored ORNL DAAC (Distributed Active Archive Center for Biogeochemical Dynamics). Her long-term research interests have focused on ontologies and the implementation of frameworks for scientific applications of interest to the Departments of Energy and Defense. These interests have been applied to the scientific domains of climate and earth sciences, fusion, medical modeling, and homeland security. She is an active participant in several leading ORNL efforts contributing to other agencies, including the NSF-sponsored DataONE (Integration and Semantics Working Group) and Remote Data Visualization and Analytics. DataONE (Data Observation Network for Earth) is developing infrastructure, strategies, and practices for decade-long sustainable data management, publication, archive, and curation services for the digital data supporting earth, environmental, and ecology research.
Carsten Keßler is a post-doc researcher at the Institute for Geoinformatics (ifgi), University of Muenster, Germany, where he finished his PhD on context-aware semantics-based information retrieval in 2009. In ifgi’s semantic interoperability lab (MUSIL), he currently coordinates the Linked Open Data University of Muenster (LODUM) project and is a member of the LinkedScience.org team. He has co-chaired a number of workshops, including the LISC2011 workshop, and is a guest editor of the Semantic Web Journal special issue on Linked Data for science and education. Besides his activities at the university, Carsten is currently consulting for the United Nations Office for the Coordination of Humanitarian Affairs (UN OCHA) on the development of the Humanitarian eXchange Language (HXL).
Paul Groth is an assistant professor in the Knowledge Representation and Reasoning Group at the VU University of Amsterdam. He holds a Ph.D. in Computer Science from the University of Southampton (2007) and has done research at the University of Southern California. His research focuses on mechanisms for enabling multi-institutional systems. This includes research in data provenance, scientific workflows and knowledge sharing, with over 50 publications in these areas. Paul is co-chair of the W3C Provenance Working Group developing a standard for provenance interchange. Currently, he is a key contributor to Open Phacts (www.openphacts.org), a project to develop a provenance-enabled platform for pharmacological information. You can find him on Twitter: @pgroth
Natasha Noy is a Senior Research Scientist at Stanford Medical Informatics. She is a principal member of the Protégé group, where she works on tools for ontology management, including versioning, mapping, and modularization of ontologies. She is currently involved in the design of the next-generation Protégé system that will support collaborative development of ontologies. Natasha is also affiliated with the National Center for Biomedical Ontologies, where she works on community-based approaches to ontology evaluation, review, and alignment.
Eric G. Stephan works for the U.S. Department of Energy Pacific Northwest National Laboratory in Richland, Washington, and has been actively engaged in advancing scientific database, geospatial, metadata, and provenance capabilities to support experimental and computational scientists and production systems. His research interests include data-intensive computing, the Semantic Web, and constructing analytical pipelines to harmonize and explore heterogeneous data/knowledge resources.
Jun Zhao is an EPSRC Postdoctoral Fellow at the Life Science Interface in the University of Oxford. Her current research interests are provenance, trust of data, Semantic Web applications for integrating biological data resources, and provenance-based information quality assessment. She has been the provenance lead in the UK data.gov.uk project and the EU Wf4Ever project. She has been a leading organizer of, and invited speaker at, many national and international workshops.
- Sean Bechhofer, University of Manchester, UK
- Chris Bizer, Free University of Berlin, Germany
- Björn Brembs, Free University of Berlin, Germany
- Boyan Brodaric, Natural Resources Canada, Canada
- Paolo Ciccarese, Harvard University, USA
- Tim Clark, Harvard University, USA
- Stefan Dietze, L3S Research Center, Germany
- Ying Ding, Indiana University, USA
- Michel Dumontier, Carleton University, Canada
- Peter Fox, Rensselaer Polytechnic Institute, USA
- Yolanda Gil, ISI, USA
- Pieter Van Gorp, Eindhoven University of Technology, The Netherlands
- Michael Huhns, University of South Carolina, USA
- Krzysztof Janowicz, University of California, Santa Barbara, USA
- Werner Kuhn, University of Muenster, Germany
- Christopher Lynnes, NASA, USA
- Zoltán Miklós, École Polytechnique Fédérale de Lausanne, Switzerland
- Mark Schildhauer, University of California, Santa Barbara, USA
- Herbert Van de Sompel, Los Alamos National Laboratory, USA
- Jie Tang, Tsinghua University, China
- Anita de Waard, Elsevier Labs
- Stephen Wan, CSIRO, Australia
- Nancy Wiegand, University of Wisconsin
- Amrapali Zaveri, University of Leipzig, Germany