
Why CTSAsearch?

Research profiling systems have been widely adopted by research institutions. A consistent, public-facing presentation of research expertise not only improves the visibility of an institution's researchers but also provides a foundation for team science by helping researchers discover potential collaborators.

One challenge in deploying these systems is the resulting 'stovepipe effect': each institution publishes data only about its own researchers, even when those researchers collaborate with many others at many other institutions. In the same way that search engines provide a single point of discovery for users of the World Wide Web, CTSAsearch provides a single point of discovery and visualization targeted specifically at scholarly researchers, their research, and their collaborators. CTSAsearch can therefore provide a number of features that build upon the data in research profiles but take a user's overall experience to a higher level:

  1. A common conceptual terminology (currently the Unified Medical Language System, UMLS, from NIH's National Library of Medicine) supports matching of queries to profiles even when a query uses a synonym of a term found in a profile. Users can zoom in and out conceptually when a given query proves too generic or too specific.
  2. A search hit in CTSAsearch points directly back to that researcher's page at their home institution, so the information presented is as fresh as possible.
  3. Visualization of a research community spans multiple institutions, allowing information seekers to see key players in a field without the need to manually search and correlate publication data.
  4. State-of-the-art social network analysis techniques reveal community relationships that span institutional boundaries.
  5. Inter-institutional analytics emerge easily through aggregation of profile data.
The end result is a significant step forward in viewing the landscape of research profiling installations as a single integrated ecosystem. For example, projects such as UCSF's Crosslinks are using CTSAsearch's open web services to populate their local system with links to external coauthors. Participation in CTSAsearch is open to any organization or group publishing openly available research data.
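The synonym matching and conceptual zooming described in item 1 can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the concept identifiers are invented (they are not real UMLS identifiers), the tiny synonym table stands in for a full terminology, and the function names are hypothetical rather than part of CTSAsearch.

```python
# Toy terminology: hypothetical concept IDs, NOT real UMLS identifiers.
SYNONYMS = {
    "heart attack": "C_MI",
    "myocardial infarction": "C_MI",
    "heart disease": "C_HD",
    "cardiac disease": "C_HD",
}
# Child concept -> broader (parent) concept.
BROADER = {"C_MI": "C_HD"}


def to_concept(term):
    """Map a free-text term to a concept ID, or None if unknown."""
    return SYNONYMS.get(term.strip().lower())


def search(query, profiles, zoom_out=False):
    """Return names of profiles whose terms share the query's concept.

    With zoom_out=True the broader parent concept is matched as well.
    """
    concept = to_concept(query)
    if concept is None:
        return []
    wanted = {concept}
    if zoom_out and concept in BROADER:
        wanted.add(BROADER[concept])
    return sorted(
        name
        for name, terms in profiles.items()
        if wanted & {to_concept(t) for t in terms}
    )


profiles = {
    "Dr. A": ["Myocardial Infarction"],
    "Dr. B": ["cardiac disease"],
}
narrow = search("heart attack", profiles)
wide = search("heart attack", profiles, zoom_out=True)
```

Here the query "heart attack" finds Dr. A even though that profile only mentions "Myocardial Infarction", and zooming out also picks up Dr. B through the broader concept.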

How does it work?

CTSAsearch is a federated search engine that uses VIVO-ontology-compliant Linked Open Data published by 87 institutions running a variety of open-source and commercial research profiling systems. These data are harvested periodically using multiple methods:

  1. Querying a SPARQL endpoint, when available;
  2. Crawling publicly visible RDF pages, when supported (e.g., by VIVO);
  3. Querying an API, when available (e.g., by Elsevier Pure); and
  4. Crawling publicly visible HTML pages and extracting data, when necessary.
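One way to read the list above is as a preference order, with HTML scraping as the last resort. The source does not state this explicitly, so treat the ordering as an assumption. The sketch below shows how a harvester might pick a method per institution; the `Site` record, its field names, and the method labels are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Site:
    """Hypothetical description of what one institution exposes."""
    name: str
    sparql_endpoint: Optional[str] = None
    serves_rdf: bool = False
    api_url: Optional[str] = None


def pick_harvest_method(site):
    """Choose a harvest method, assuming the list order is a preference order."""
    if site.sparql_endpoint:
        return "sparql"
    if site.serves_rdf:
        return "rdf-crawl"
    if site.api_url:
        return "api"
    # Nothing structured is available, so fall back to scraping HTML.
    return "html-scrape"


sites = [
    Site("Alpha", sparql_endpoint="https://alpha.example.org/sparql"),
    Site("Beta", serves_rdf=True),
    Site("Gamma", api_url="https://gamma.example.org/api"),
    Site("Delta"),
]
plan = {s.name: pick_harvest_method(s) for s in sites}
```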
Once harvested, all data, irrespective of source, are merged and normalized into a single integrated collection of profiles. This integrated collection is then scanned for research concepts and a search index is built. Search results can be viewed either in a traditional tabular form or using a range of visualization techniques based upon relationships identified between profiles.
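The merge, normalize, and index steps can be sketched roughly as follows. This is a simplified illustration, not CTSAsearch's actual pipeline: the record fields (`uri`, `name`, `overview`, `summary`), the source labels, and the plain substring matching used to spot concepts are all assumptions.

```python
from collections import defaultdict


def normalize(record, source):
    """Map source-specific field names onto one shared profile shape."""
    return {
        "uri": record["uri"],
        # Collapse stray whitespace in names.
        "name": " ".join(record.get("name", "").split()),
        # Different systems may label the free-text field differently.
        "text": record.get("overview") or record.get("summary") or "",
        "source": source,
    }


def merge(batches):
    """Merge (source, records) batches into one collection keyed by URI."""
    profiles = {}
    for source, records in batches:
        for rec in records:
            profile = normalize(rec, source)
            profiles[profile["uri"]] = profile
    return profiles


def build_index(profiles, vocabulary):
    """Build an inverted index: concept term -> set of profile URIs."""
    index = defaultdict(set)
    for uri, profile in profiles.items():
        text = profile["text"].lower()
        for term in vocabulary:
            if term in text:  # naive substring match, for illustration only
                index[term].add(uri)
    return index


batches = [
    ("vivo", [{"uri": "https://example.org/a", "name": "Ada  Smith",
               "overview": "Studies asthma in children."}]),
    ("pure", [{"uri": "https://example.org/b", "name": "Bo Lee",
               "summary": "Asthma and diabetes outcomes."}]),
]
profiles = merge(batches)
index = build_index(profiles, {"asthma", "diabetes"})
```

A real concept scanner would map text onto terminology concepts rather than matching raw substrings, but the shape of the result, a lookup from concept to matching profiles, is the same.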