Workshop on GIScience in the Big Data Age 2012

The Workshop on GIScience in the Big Data Age 2012 (GIBDA2012) will be organized in conjunction with the seventh International Conference on Geographic Information Science 2012 (GIScience 2012) in Columbus, Ohio, USA on September 18th, 2012.

Workshop Description and Scope

The rapidly expanding information universe, with new data created at a speed surpassing our capacity to store it, calls for improved methods to retrieve, filter, integrate, and share data. The vision of a data-intensive science is that the open availability of data with a higher spatial, temporal, and thematic resolution will enable us to better address complex scientific and social questions. On the downside, however, understanding, sharing, and reusing these data becomes more challenging. Big Data is big not only because it involves a huge amount of data, but also because of the high dimensionality and inter-linkage of these data sets. The on-the-fly integration of heterogeneous data from various sources has been named one of the frontiers of Digital Earth research, Bioinformatics, the Digital Humanities, and other emerging research visions.

From a more technical perspective, a knowledge infrastructure is required to handle Big Data. Currently, the most promising approach is the Linked Data cloud. While the Web has changed with the advent of the Social Web from mostly authoritative content towards increasing amounts of user-generated content, it is essentially still about linked documents. These documents provide structure and context for the described data and ease their interpretation. In contrast, the upcoming Data Web is about linking data, not documents. Such data sets are not bound to a specific document but can be easily combined and used outside of the original context. With a growth rate of millions of new facts encoded as RDF triples per month, the Linked Data cloud allows users to answer complex queries spanning multiple sources. Because the data are uncoupled from their original creation context, semantic interoperability, identity resolution, and ontologies are central methodologies to ensure consistency and meaningful results.

Space and time are fundamental ordering relations to structure such data and provide an implicit context for their interpretation. Prominent geo-related Linked Data hubs include Geonames.org as well as the Linked Geo Data project, which provides an RDF serialization of OpenStreetMap. Furthermore, many other Linked Data sources contain location references, e.g., observation data provided by sensors.

This full-day workshop is a follow-up event to the successful first workshop on Linked Spatiotemporal Data at GIScience 2010. While this first workshop was centered around Linked Data and geo-ontologies, the GIBDA 2012 workshop takes a broader perspective by highlighting data-intensive science as the research vision and Linked Data as a promising knowledge infrastructure. We hope that the workshop will help better define the data, knowledge representations, infrastructure, reasoning methodologies, and tools needed to link and query massive data based on their spatial and temporal characteristics.

List of Relevant Topics

Topics of interest for the GIBDA 2012 workshop include (but are not limited to):

  • Mining Big Data

    • Learning geo-ontologies out of massive data
    • Abduction-based frameworks and systems
    • Mining Location-based Social Networks
    • Studying the geo-indicativeness of massive, semi-structured data
    • Analogy-based search in Big Data
    • Semantic heterogeneity and ontology alignment
    • Semantics-enabled geo-statistics
  • Retrieving and browsing of Linked Spatiotemporal Data

    • Learning Linked Spatiotemporal Data from existing sources
    • Spatiotemporal indexing of Linked Data
    • Harvesting Linked Data from heterogeneous sources
    • Spatial extensions to query languages (e.g., GeoSPARQL)
    • Visualizing and browsing through Linked Spatiotemporal Data
  • Big Data and Volunteered Geographic Information (VGI)

    • Spatiotemporal aspects of data quality, trust, and provenance
    • Tag and vocabulary recommendations for annotating VGI
    • Maintenance of outgoing links
  • Application of Linked Spatiotemporal Data

    • Linked Data and Sensor Web Enablement (SWE)
    • Linked Data and mobile applications
    • Linked Data gazetteers and Points Of Interest
    • Linked Data in the domain of cultural heritage research
  • Integration and Interoperation of Linked Spatiotemporal Data

    • Ontologies and vocabularies to support interoperability
    • Geo-Ontology Design Patterns
    • Identity assumptions and resolution for data fusion and integration
    • The role of space and time to structure Linked Data
    • Versioning of spatiotemporal data
    • Semantic annotation and Microformats
    • Adding contextual information to Linked Data

Workshop Format and Structure

The full-day workshop will focus on intensive discussions setting a roadmap towards publishing, structuring, retrieving, and consuming Linked Spatiotemporal Data and understanding how GIScience can contribute to the vision of a data-intensive science. The workshop will accept three kinds of contributions: full research papers presenting new work in the indicated areas, statements of interest, and data challenge papers. While the research papers will be selected based on review results adhering to classical scientific quality criteria, the statements of interest should raise questions, present visions, and point to open gaps. Statements of interest will nevertheless also be reviewed to ensure the quality and clarity of the presented ideas.

We also welcome demonstrations of existing tools, applications, and geo-ontologies. Details for the data challenge are given below. The presentation time per speaker will be restricted to 5 minutes for statements of interest and 10 minutes for full papers. Based on the presented work, all workshop participants will decide on 2–3 research topics to be discussed in breakout groups. In a final session, the breakout groups will present their findings on research topics and challenges and try to integrate them across the discussed topics.

Submissions and Proceedings

All presented papers will be made available through the workshop Web page, the electronic conference proceedings of GIScience 2012, as well as via CEUR-WS. Full research papers should be approximately 7 to 10 pages, while statements of interest and data challenge papers should be between 5 and 6 pages. Selected papers may be considered for a fast-track submission to the Semantic Web journal by IOS Press.

Please upload your submission using the workshop’s EasyChair web-page.

Data Challenge

The website spatial.linkedscience.org/ contains a growing collection of metadata for proceedings of conferences on topics related to geographic information science. So far, it contains most of the metadata for the GIScience, COSIT, ACM GIS, and AGILE conference series. Within the GIBDA Data Challenge, we are looking for

  • innovative analyses of the data
  • interactive visualizations
  • approaches for cleaning the data up
  • pattern and topic mining
  • enrichment and interlinking with other datasets (e.g., from the Linked Data cloud)
  • insights into GIScience as a research field
  • adding social roles and aspects

The raw data can be queried via the SPARQL endpoint spatial.linkedscience.org/sparql. Submissions to the data challenge are to be made through EasyChair as a brief description of the entry, along with a link to the demo/analysis/dataset. Entries to the challenge will be evaluated by the program committee based on innovativeness and potential impact. The winner will be awarded a $250 prize and will present at the workshop.
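
As a starting point for challenge entries, the following is a minimal Python sketch of how a query could be sent to the endpoint using the standard SPARQL protocol over HTTP GET. The `http://` scheme, the function names, and the sample query are assumptions made for illustration; the query is deliberately vocabulary-agnostic because the actual classes and properties used in the dataset are not described here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the endpoint named above is reachable over plain HTTP.
ENDPOINT = "http://spatial.linkedscience.org/sparql"

# A vocabulary-agnostic query: list any ten triples in the store.
SAMPLE_QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_query_url(endpoint, query):
    """Encode a SPARQL query as an HTTP GET URL (SPARQL 1.1 Protocol)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint, query):
    """Send the query and decode a SPARQL JSON results document.

    Hypothetical helper: it assumes the server honors the standard
    JSON results media type requested in the Accept header.
    """
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `run_query(ENDPOINT, SAMPLE_QUERY)` would then return the decoded result document, provided the endpoint is online and supports JSON results.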

Important Dates

  • Submission due: 18. June 2012
  • Acceptance Notification: 6. July 2012
  • Camera-ready Copies: 16. July 2012

Organizers

Programme Committee

  • TBA

Related Activities

Please feel free to contact the organizers for further questions at jano @ geog . ucsb. edu.

 


Tutorial on SPARQL Package for R

Tools are major enablers of Linked Science. One crucial aspect is how to access and analyze data, and especially how to get only the part of the data that is of interest for a given research question. Linked Data solves the access part, and SPARQL makes it possible to query only a subset of the data. For statistical computing there are tools like R. To bridge the two communities, statistical computing and the Semantic Web, there is now a SPARQL Package for R, which makes it possible to bring data from Linked Data services into R for analysis. At LinkedScience.org you can now find a tutorial on the SPARQL Package for R. By following it you can learn:

  1. how to access and query Linked Spatiotemporal Data of the deforestation statistics related to the Brazilian Amazon Rainforest, and
  2. how to analyze it within R (which is a free software environment for statistical computing).
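
The tutorial itself works in R. For readers who want to see the same two steps outside R, here is a rough Python sketch of the second step: flattening a result document in the standard SPARQL JSON results format into rows and computing a simple statistic. The function names, the variable names `region` and `value`, and the sample values are all placeholders invented for illustration; they are not taken from the deforestation dataset or from the tutorial.

```python
import statistics


def bindings_to_rows(results):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    variables = results["head"]["vars"]
    rows = []
    for binding in results["results"]["bindings"]:
        # Unbound variables are absent from a binding, so fall back to None.
        rows.append({v: binding.get(v, {}).get("value") for v in variables})
    return rows


def numeric_column(rows, variable):
    """Collect one variable as floats, skipping rows where it is unbound."""
    return [float(r[variable]) for r in rows if r[variable] is not None]


# Placeholder document with made-up values, NOT real deforestation figures.
SAMPLE = {
    "head": {"vars": ["region", "value"]},
    "results": {"bindings": [
        {"region": {"type": "literal", "value": "A"},
         "value": {"type": "literal", "value": "1.0"}},
        {"region": {"type": "literal", "value": "B"},
         "value": {"type": "literal", "value": "3.0"}},
        {"region": {"type": "literal", "value": "C"}},  # "value" is unbound
    ]},
}

rows = bindings_to_rows(SAMPLE)
mean_value = statistics.mean(numeric_column(rows, "value"))
```

Keeping the flattening step separate from the analysis step mirrors the division of labor described above: SPARQL selects the subset of data, and the statistical environment only sees a plain table.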


First course worldwide on Linked Science at the University of Muenster, Germany

The first course worldwide on Linked Science will be held at the Institute for Geoinformatics at the University of Muenster, Germany, during the winter semester 2011/2012. The course is arranged as a seminar titled “Spatiotemporal and Semantic Modeling for Linked Science” and is taught by Dr. Tomi Kauppinen.

The seminar teaches the basics and presents recent advances in Linked Spatiotemporal Data and Linked Science. It combines lectures and demo sessions that show through examples how spatiotemporal information can be modeled, semantically described, and published as Linked Data. The major emphasis is on scientific datasets.

In the course we will also discuss spatiotemporal and semantic reasoning techniques to enrich the data. Students will also learn how the data can be connected to R for statistical analysis with the help of the SPARQL package, and which visualization techniques and tools are available for interacting with the data. Each student will choose a seminar topic that involves creating and using Linked Scientific Data from some discipline (e.g., life sciences, natural and living environment studies, chemistry, biology, crisis management, history and cultural heritage) within the University of Muenster.

The major emphasis is on disciplines with interesting spatiotemporal aspects. The results of the student projects will be shown and discussed in the demo sessions. The course serves both newcomers to Linked Data techniques and advanced students who already know the basics and want to learn the Linked Science approach. Students will learn theory and techniques as well as presentation and organizational skills in the seminar.


Linked Science 2011 program announced

The program of the First International Workshop on Linked Science (LISC2011) collocated with ISWC 2011 (to be held on October 24th in Bonn, Germany) is now announced:

9AM – 9:45 AM Session 1: Opening

  • Opening Remarks. 15’
  • Invited Talk: Damian Gessler. “Semantic Web for Science: Lessons of the iPlant Collaborative and SSWAP”. 30’

9:45 – 10:30 AM Session 2: Applications - Successes and Challenges

  • Linked Data for Network Science. Paul Groth and Yolanda Gil.  (30’ – 12 pages)
  • Linking the Outcomes of Scientific Research: Requirements from the Perspective of Geosciences. Stephan Mäs, Matthias Müller, Christin Henzen and Lars Bernard. (15’ – 6 pages)

10:30 AM Coffee break

11 AM – 11:45 AM Session 3: Semantic Integration

  • Interactively Mapping Data Sources into the Semantic Web. Craig A. Knoblock, Pedro Szekely, Jose Luis Ambite, Shubham Gupta, Aman Goel, Maria Muslea, Kristina Lerman and Parag Mallick. (30’ – 12 pages)
  • Similarity between semantic description sets: addressing needs beyond data integration. Todd Vision, Hilmar Lapp, Paula Mabee, Monte Westerfield and Judith Blake. (15’ – 4 pages)

11:45 – 12:30 PM Session 4: Collaborations and languages

  • Supporting Scientific Collaboration Through Class-Based Object Versioning. Johnson Mwebaze, Danny Boxhoorn and Edwin Valentijn. (30’ – 12 pages)
  • Glottolog/Langdoc: Defining dialects, languages, and language families as collections of resources. Sebastian Nordhoff and Harald Hammarström. (15’ – 6 pages)

12:30 – 2 PM Lunch

2 PM – 2:30 PM Session 5: Sources

  • The knowledge-driven exploration of integrated biomedical knowledge sources facilitates the generation of new hypotheses. Vinh Nguyen, Olivier Bodenreider, Todd Mining and Amit Sheth. (15’ – 5 pages)
  • Where did you hear that? Information and the Sources They Come From. Jim McCusker, Timothy Lebo, Li Ding, Cynthia Chang, Paulo Pinheiro Da Silva and Deborah L. McGuinness. (15’ – 5 pages)

2:30 – 3 PM  Discussion about topics for break-out groups; group organization

3 – 4 PM Break-out sessions

4 PM Coffee break 

4:30 PM  Results of the break-out sessions

The program is also available at the LISC 2011 event pages.


Linked Science 2011 proceedings published

 

The proceedings of LISC2011 have now been published as CEUR Workshop Proceedings:

Tomi Kauppinen, Line C. Pouchard, Carsten Keßler (Eds.): Proceedings of the First International Workshop on Linked Science 2011, Bonn, Germany, October 24, 2011. CEUR Workshop Proceedings, Volume 783, available at http://www.ceur-ws.org/Vol-783.


Spatial.LinkedScience.org opened!

Over the last few weeks we (Krzysztof Janowicz, Carsten Keßler, Alexander Savelyev and Tomi Kauppinen) created a Linked Data set about the people, papers and proceedings of the COSIT (Conference on Spatial Information Theory) series.

The result is now open and can be explored at Spatial.LinkedScience.org! We look forward to maintaining the portal as a community effort in order to give back to the community, i.e., the researchers of Spatial Information and Geographic Information Science.

Would other communities be interested in joining LinkedScience.org? Just contact us, and let us plan it.


List of accepted papers to LISC2011 announced

The following eight papers have been accepted for presentation at the 1st International Workshop on Linked Science (LISC2011), Oct 24th, 2011 in Bonn, Germany. We received 16 submissions by the deadline, so the acceptance rate was 50%.

  • Linked Data for Network Science
    Paul Groth and Yolanda Gil. 
  • The knowledge-driven exploration of integrated biomedical knowledge sources facilitates the generation of new hypotheses
    Vinh Nguyen, Olivier Bodenreider, Todd Mining and Amit Sheth. 
  • Glottolog/Langdoc: Defining dialects, languages, and language families as collections of resources
    Sebastian Nordhoff and Harald Hammarström. 
  • Linking the Outcomes of Scientific Research: Requirements from the Perspective of Geosciences
    Stephan Mäs, Matthias Müller, Christin Henzen and Lars Bernard. 
  • Supporting Scientific Collaboration Through Class-Based Object Versioning
    Johnson Mwebaze, Danny Boxhoorn and Edwin Valentijn. 
  • Similarity between semantic description sets: addressing needs beyond data integration
    Todd Vision, Hilmar Lapp, Paula Mabee, Monte Westerfield and Judith Blake.
  • Interactively Mapping Data Sources into the Semantic Web
    Craig A. Knoblock, Pedro Szekely, Jose Luis Ambite, Shubham Gupta, Aman Goel, Maria Muslea, Kristina Lerman and Parag Mallick. 
  • Where did you hear that? Information and the Sources They Come From
    Jim McCusker, Timothy Lebo, Li Ding, Cynthia Chang, Paulo Pinheiro Da Silva and Deborah L. McGuinness.

The detailed program will be announced soon at linkedscience.org/events/lisc2011.


What should a science schema look like?

Tomi Kauppinen and Alkyoni Baglatzi ran a breakout session at Science Online London 2011 with the question “Can we develop something like schema.org to encourage data sharing and reuse?”. This story combines the preparation of the session, the presentation given at the session, and the results. Follow @LinkedScience to hear how the results get implemented and published as a science schema. The presentation and results of this science schema breakout session are now available at:

http://storify.com/linkedscience/open-science


Linked Science Core vocabulary online

The Linked Science Core (LSC) vocabulary is designed for describing a research setting and for interconnecting it with other related things and components (researcher, data, hypothesis, etc.).

You may check LSC online at http://linkedscience.org/lsc/ns/. There will be a breakout session at Science Online London 2011 (#solo11) to develop a schema or a vocabulary for science, and we will use LSC as a basis for stimulating the imagination for doing so.

We hope to use the results of Science Online London to extend and improve LSC. Stay tuned and follow our Twitter feed @LinkedScience, or participate in our breakout session at Science Online London to develop new ideas for describing and publishing scientific content online!


CFP: 1st International Workshop on Linked Science 2011 (LISC2011)

CALL FOR PAPERS
1st International Workshop on Linked Science 2011 (LISC2011)
Collocated with the 10th International Semantic Web Conference (ISWC2011)
October 24th, 2011
Bonn, Germany

Workshop URI: http://linkedscience.org/events/lisc2011

OBJECTIVES

Scientific efforts are traditionally published only as articles, with an estimated number of millions of publications worldwide per year; the growth rate of PubMed alone is now 1 paper per minute. The validation of scientific results requires reproducible methods, which can only be achieved if the same data, processes, and algorithms as those used in the original experiments are available. However, although publications, methods and datasets are closely related, they are not always openly accessible and interlinked. Even where data is discoverable, accessible and assessable, significant challenges remain in the reuse of the data, in particular facilitating the necessary correlation, integration and synthesis of data across levels of theory, techniques and disciplines. At LISC 2011 (1st International Workshop on Linked Science) we will discuss and present results on new ways of publishing, sharing, linking, and analyzing such scientific resources motivated by driving scientific requirements, as well as reasoning over the data to discover interesting new links and scientific insights.

Making entities identifiable and referenceable using URIs augmented by semantic, scientifically relevant annotations greatly facilitates access to and retrieval of data that used to be barely accessible. This Linked Science approach, i.e., publishing, sharing and interlinking scientific resources and data, is of particular importance for scientific research, where sharing is crucial for facilitating reproducibility and collaboration within and across disciplines. This integrated process, however, has not been established yet. Bibliographic contents are still regarded as the main scientific product, and associated data, models and software are either not published at all, or published in separate places, often with no reference to the respective paper.

In the workshop we will discuss whether and how new emerging technologies (Linked Data, and semantic technologies more generally) can realize the vision of Linked Science. We see that this depends on their enabling capability throughout the research process, leading up to extended publications and data sharing environments. First, our workshop aims to address challenges related to enabling the easy creation of data bundles (data, processes, tools, provenance and annotation) that support both publication and reuse of the data. Second, we look for tools and methods for the easy correlation, integration and synthesis of shared data. This problem is found in many disciplines (including astronomy, biology, geosciences, cultural heritage, earth, climate, environmental and ecological sciences and impacts etc.), as they need to span techniques, levels of theory, scales, and disciplines. With the advent of Linked Science, it is timely and crucial to address these identified research challenges through both practical and formal approaches.

SUBMISSIONS

We invite two kinds of submissions:
- Research papers. These should not exceed 15 pages in length.
- Position papers. Novel ideas, experiments, and application visions from multiple disciplines and viewpoints are a key ingredient of the workshop. We therefore strongly encourage the submission of position papers. Position papers should not exceed 5 pages in length.

Submissions should be formatted according to the Lecture Notes in Computer
Science guidelines for proceedings available at http://www.springer.com/computer/lncs?SGWID=0-164-7-72376-0. Papers should be submitted in PDF format. All submissions will be done electronically via the LISC2011 web submission system.

At least one author of each accepted paper must register for the workshop.
Information about registration will appear soon on the ISWC2011 Web pages.

TOPICS OF INTEREST

In both categories, papers are expected in (but not restricted to) the following topics:

- Key research life cycle challenges in enabling linked science and proposed solution strategies
- Interrelationship of existing traditional solutions and new linked science solutions
- Formal representations of scientific data
- Ontologies for scientific information
- Reasoning mechanisms for linking scientific datasets
- Integration of quantitative and qualitative scientific information
- Ontology-based visualization of scientific data
- Semantic similarity in science applications
- Semantic integration of crowd sourced scientific data
- Connecting scientific publications with underlying research datasets
- Provenance, quality, privacy and trust of scientific information
- Enrichment of scientific data through linking and data integration
- Semantic driven data integration
- Support for data publishing for sharing and reuse
- Case studies on linked science, e.g., astronomy, biology, environmental and socio-economic impacts of global warming, statistics, environmental monitoring, cultural heritage, etc.
- Barriers to the acceptance of linked science solutions and strategies to address these
- Linked Data for
  - dissemination and archiving of research results
  - collaboration and research networks
  - research assessment
- Applications for research that build on top of Linked Data
- Legal, ethical and economic aspects of Linked Data in science

PROCEEDINGS

We expect the workshop proceedings to be published as CEUR Workshop
Proceedings (see http://ceur-ws.org).

IMPORTANT DATES

- Paper submission deadline: August 15
- Notification of acceptance or rejection: September 5
- Camera ready version due: September 16

WORKSHOP CHAIRS

- Tomi Kauppinen, University of Muenster, Germany
- Line C. Pouchard, Oak Ridge National Laboratory, USA

ORGANIZING COMMITTEE

- Mathieu d’Aquin, Open University, UK
- Frank van Harmelen, Vrije Universiteit Amsterdam, The Netherlands
- Carsten Keßler, University of Muenster, Germany
- Kerstin Kleese-Van Dam, Pacific Northwest National Laboratory, USA
- Eric G. Stephan, Pacific Northwest National Laboratory, USA
- Jun Zhao, University of Oxford, UK

PROGRAMME COMMITTEE

- Sören Auer, University of Leipzig, Germany
- V. Balaji, Princeton University and NOAA/GFDL, USA
- Luis Bermudez, Open Geospatial Consortium, USA
- Benno Blumenthal, Columbia University, USA
- Chris Bizer, Free University of Berlin, Germany
- Tim Clark, Harvard University, USA
- Philippe Cudre-Mauroux, University of Fribourg, Switzerland
- Anusuriya Devaraju, University of Münster, Germany
- Stefan Dietze, L3S Research Center, Germany
- Kai Eckert, Mannheim University Library, Germany
- Peter Fox, Rensselaer Polytechnic Institute, USA
- Auroop Ganguly, Oak Ridge National Laboratory and University of Tennessee, Knoxville, USA
- Damian Gessler, U. of Arizona, USA
- Paul Groth, VU University Amsterdam, The Netherlands
- John Harney, Oak Ridge National Laboratory, USA
- Laura Hollink, TU Delft, The Netherlands
- Maria Indrawan, Monash University, Australia
- Antoine Isaac, Europeana, The Netherlands
- Krzysztof Janowicz, Pennsylvania State University, USA
- Matt Jones, UC Santa-Barbara, USA
- Werner Kuhn, University of Münster, Germany
- Chris Lynnes, NASA, USA
- Deborah L. McGuinness, Rensselaer Polytechnic Institute, USA
- Jim Myers, Rensselaer Polytechnic Institute, USA
- Paulo Pinheiro da Silva, University of Texas El Paso, USA
- Martin Raubal, ETH Zürich, Switzerland
- Mark Schildhauer, UC Santa-Barbara, USA
- Anita de Waard, Elsevier Labs
