DC2005 and ECDL2005

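This year's Dublin Core metadata conference (DC2005) and the European Conference on Digital Libraries (ECDL2005) will be held one after the other in mid-to-late September, the former in Madrid, the capital of Spain, and the latter in Vienna, the city of music. The calls for papers hint at subtle shifts in the topics this field cares about.
For DC, this year marks its tenth anniversary, so the conference should carry some celebratory meaning.
If I were to submit a paper, using FOAF to build a personal name authority file would make a good topic: it ties together distributed digital library architecture, metadata applications, and vocabulary authority control, with both theory and practice.
Official DC2005 website: http://dc2005.uc3m.es/
Official ECDL2005 website: http://www.ecdl2005.org/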

DC2005 CFP

Metadata based on standards such as Dublin Core are a key component of information environments from scientific repositories to corporate intranets and from business and publishing to education and e-government.

DC-2005 – the fifth in a series of conferences previously held in Tokyo (2001), Florence (2002), Seattle (2003), and Shanghai (2004) – will examine the practicalities of maintaining and using controlled sets of terms (“vocabularies”) in the context of the Web.

DC-2005 aims at bringing together several distinct communities of vocabulary users:

These diverse communities share common problems, from the use of identifiers for terms to practices for developing, maintaining, versioning, translating, and adapting standard vocabularies for specific local needs. Topics of particular relevance include:

The Program Committee would like to solicit contributions of the following types:

Paper submissions will be peer-reviewed by the program committee and published both in print and electronically in the conference proceedings. All accepted papers must be presented at the conference by at least one of their authors.

The official language of the conference is English, but we will provide simultaneous translation (English-Spanish) for keynotes, tutorials, and plenary sessions.



The "Domain Knowledge" of My Thesis

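My recent reading has brought my interests in different areas together. Metasearch, research on resource collections, the role of ontologies in the interoperability of heterogeneous information systems, knowledge organization and metadata, the role of authority files, digital library architecture, and so on: all of these can be fused into one whole. Yet the structure in my head comes and goes and is not yet settled. I need an "ontology" that expresses this kind of "grand unified" structure.

Today I looked at the ADL (Alexandria Digital Library) project at UC Santa Barbara. This digital library project studied resource integration very thoroughly and in depth, and it is a good reference prototype for my dissertation.

For the basics of the early DLI-1 period, see the D-Lib Magazine special issue: http://www.dlib.org/dlib/july96/07contents.html . I have not yet found a dedicated summary of later developments, of how the projects influenced one another, or of how their results were applied afterwards. What I care about most is what concrete results and shared conclusions these projects reached on interoperability, metadata application, and the description of resource collections. Their technical solutions still look advanced today, but they are messy and experimental, and not simple enough to achieve large-scale, widespread adoption.

Within DLI-2, Hsinchun Chen (陈炘钧) led one project: High-Performance Digital Library Classification Systems: From Information Retrieval to Knowledge Management. On the web page of Chen's Artificial Intelligence Lab, this project has become the lab's general overview of its "Digital Libraries" research, covering everything from a subproject that the University of Arizona received in 1996 from the Alexandria Digital Library project at UC Santa Barbara (only 50,000 US dollars) to the NSF project that closed in September 2004. By a rough count, Professor Chen has obtained about 4 million US dollars in funding for digital library research.

On that page, Chen states his research goals as follows: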

To develop techniques to enhance information retrieval and knowledge management of large digital collections. Our work includes portal building initiatives in a wide variety of domains and in multiple languages testing collection building, search, visualization, and analysis techniques.

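Judging from the projects, the work concentrates mostly on digital libraries for bioinformatics, education, and the computer science literature.

The main areas of technical research and development are the following: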



JCDL-NKOS Workshop 2005

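This year's JCDL, jointly sponsored by IEEE and ACM, will be held in Denver in mid-June, and NKOS once again has a joint session. It is worth noting that Marcia Zeng (曾蕾) and Jian Qin (秦健) are core members of this group. Below is their workshop plan, which also serves as an announcement; it shows where things are heading this year.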

Integration Challenges and Strategies

The 7th Networked Knowledge Organization Systems (NKOS) Workshop

Gail Hodge

Linda Hill

Jian Qin

Douglas Tudhope

Marcia Zeng

ABSTRACT

This year's Networked Knowledge Organization Systems (NKOS) workshop builds on seven years of workshops in the U.S. and Europe on issues enabling networked knowledge organization systems (KOS), such as classification systems, thesauri, gazetteers, taxonomies, and ontologies, to support the description, retrieval, and use of diverse information resources. Now many efforts are underway to research the issues and implement solutions to the challenges of networking and integrating KOS across somewhat isolated domains: indexing services and thesaurus builders; computer scientists and system integrators; ontologists; taxonomists; and others. Requirements to solve these integration issues have become mission critical in many cases; the need to support computational, programmatic integration to handle masses of data from independent sources is pushing the research and development agenda. The need to move forward to meet these challenges while at the same time applying the best practices and “wisdom” developed through years of practical experience is acute.

The JCDL-NKOS workshop for 2005 will bring together researchers and implementers from diverse international communities who are developing new models, conducting research, and implementing practical solutions for networking KOS and integrating the associated information and data resources.

Topics may include:

Keywords

Controlled Vocabularies, Thesauri, Topic Maps, Ontologies, Networked Knowledge Organization Systems

AIM AND OBJECTIVES

The primary aim of the workshop is to inform NKOS researchers and practitioners about developments across a number of communities and to identify research and development directions. The objectives are to encourage sharing of new initiatives and lessons learned and to identify collaborative development opportunities.

PROGRAM

Session 1: Welcome and introduction to NKOS and to the workshop

Session 2: Self-introductions and brief descriptions of projects and interests particularly as they relate to the topic of integration and interoperability

Session 3: Case studies about the interoperability and integration of KOSs (approximately 2 presentations)

Session 4: Presentations on methodologies, tools and strategies (approximately 2)

Session 5: Presentations on recent related standards activities (approximately 2)

Session 6: Open discussion of the issues raised in the previous sessions

Session 7: Identification of a research agenda based on reactions from a panel of practitioners, software developers, academics and standards developers

WORKSHOP FORMAT

The full-day workshop will include invited and accepted presentations, guided by a program committee. Presentations will be grouped into topic sessions and demonstrations can be set up for access before and after the workshop and during breaks. Discussion and identification of issues will be encouraged by providing a significant amount of time for open discussion and networking opportunities. Participants will be given the opportunity to introduce their work and their interests. The identification of a research agenda will involve significant facilitated discussions.

ATTENDEES

The workshop will be announced via the NKOS listserv, which now has over 100 members from more than 10 countries. In addition, other relevant listservs and groups, such as ASIST-L, ASIST SIG-CR, ECDL, Dig-Lib, ASI, and standards-related groups will be notified. Approximately 25-35 participants are expected, though the organizers would prefer not to limit attendance. If necessary, the participants will be accepted in the order of registration. As in the past, the organizers will work with the JCDL organizers to coordinate logistics and monitor registrations.

WORKSHOP ORGANIZERS (omitted)

PREVIOUS NKOS ACTIVITIES

This is the 7th in the series of NKOS workshops held in conjunction with the JCDL. Past topics have included protocols for networked KOS, requirements for electronic thesauri, digital gazetteers, and moving traditional KOS into semantic web concepts. Attendance has ranged from 15 to over 50 people. NKOS is an ad hoc working group of academics and practitioners interested in various knowledge organization systems in networked environments. The listserv, hosted by NSF DL2, includes over 100 professionals from more than 10 countries. The NKOS Web site ( http://nkos.slis.kent.edu ) is hosted by Kent State Univ. NKOS-related sessions have been held at ECDL. The ECDL and JCDL NKOS sessions have resulted in proceedings which are available from the NKOS Web site, and in special issues of the Journal of Digital Information.



Meta-search: Will SRW/U Become the NISO Metasearch Standard?

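1. D-Lib Magazine has published an article on SRW/U that compares SRW/U with the OAI protocol (OAI-PMH) and proposes a way to use the two protocols together. It made me think that I should introduce both protocols in the "Knowledge Organization" course, along with their compatibility.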

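2. As the Web/XML version of Z39.50, SRW/U is a thorough transformation. In fact, under ZING (Z39.50 International: Next Generation) the functions of Z39.50 are taken over by a whole family of new protocols, not by SRW/U alone. See: http://www.loc.gov/z3950/agency/zing/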

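3. Also in this issue of D-Lib Magazine I saw the metadata scheme of the OLAC project, published online in a fairly well-formed DC profile style; it can serve as a reference for our own project.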
http://www.language-archives.org/OLAC/metadata.html
http://www.language-archives.org/REC/olac-extensions.html

4. What exactly is the relationship between NISO's Metasearch Initiative ( http://www.niso.org/committees/MetaSearch-info.html ) and ZING? Perhaps NISO hopes that the ZING work can serve as the standard for next-generation metasearch.
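
To make the contrast between the two protocols in item 1 concrete, here is a minimal Python sketch that builds one request URL for each. It is not taken from the D-Lib article. The base URLs and the sample query are hypothetical, and the parameter names (operation, version, query, maximumRecords for SRU; verb, metadataPrefix, from for OAI-PMH) are given as I recall them from the two specifications, so they should be checked there before use.

```python
from urllib.parse import urlencode

# Hypothetical endpoints, used only for illustration.
SRU_BASE = "http://example.org/sru"
OAI_BASE = "http://example.org/oai"


def sru_search_url(cql_query, max_records=10):
    """Build an SRU searchRetrieve URL: the client asks a question (a CQL query)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
    }
    return SRU_BASE + "?" + urlencode(params)


def oai_harvest_url(metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords URL: the client copies records in bulk."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return OAI_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    print(sru_search_url('dc.title = "digital library"'))
    print(oai_harvest_url(from_date="2005-01-01"))
```

The difference in shape is the point: an SRU request carries a query and is answered at search time, while an OAI-PMH request carries no query at all and simply asks for records to be handed over for local indexing.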

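Addendum, 2005/2/23:

I came across a distinction between kinds of metasearch on the blog 年心搏客 ( http://hjn66.blogchina.com/ ). It seems to make some sense; I wonder whether it is the common understanding in China?

2. Integrated search: the metadata of each database is copied out to form a new secondary (bibliographic) database, with link management back to the source documents. This approach is technically demanding and needs the support of the data vendors. TRS is of this type, and this is also the direction in which cross-database search in digital libraries abroad is developing. Domestic data vendors, however, are relatively closed, so it is not easy to carry out!

To be continued…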
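
The harvest-and-index idea behind integrated search can be sketched in a few lines of Python. This is only an illustration of the general approach, not a description of how TRS or any other product works. The two sources, their records, and the URLs are invented for the example.

```python
from collections import defaultdict

# Hypothetical metadata records copied out of two vendor databases.
SOURCE_A = [
    {"id": "a1", "title": "Digital library architecture", "url": "http://example.org/a/1"},
    {"id": "a2", "title": "Metadata interoperability", "url": "http://example.org/a/2"},
]
SOURCE_B = [
    {"id": "b1", "title": "Ontology and metadata", "url": "http://example.org/b/1"},
]


def build_index(sources):
    """Merge harvested records into one secondary database plus an inverted index."""
    records = {}
    index = defaultdict(set)
    for name, harvested in sources.items():
        for rec in harvested:
            key = f"{name}:{rec['id']}"
            records[key] = rec
            for word in rec["title"].lower().split():
                index[word].add(key)
    return records, index


def search(records, index, term):
    """Answer a query from the local index only; return links back to the sources."""
    keys = sorted(index.get(term.lower(), set()))
    return [(records[k]["title"], records[k]["url"]) for k in keys]


if __name__ == "__main__":
    records, index = build_index({"A": SOURCE_A, "B": SOURCE_B})
    for title, url in search(records, index, "metadata"):
        print(title, url)
```

No source database is contacted at query time. That is why the approach depends on vendors allowing their metadata to be copied out in the first place.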



Retrieval Problems in Digital Libraries

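I continue to study the parts of Modern Information Retrieval related to my recent interests: metasearch, the basic problems of digital libraries, knowledge organization, and so on.

Modern Information Retrieval offers a view of digital libraries from the standpoint of computer science:

A digital library is:

The authors also hold that, because digital libraries span regions, multilingual support is the foremost problem for digital libraries. Solving it starts with character sets, which can be handled by downloading them over the network; cross-language retrieval is likewise an important open problem. Techniques such as QBIC, visual browsing, and visual aids help make cross-language retrieval possible.

Multimedia retrieval is also one of the core technologies of digital libraries.

Taking the document as the structural unit of a digital library, a document's structure and its metadata can give the digital library structure and semantics at the micro level. Structure and semantics are the most important content of a digital library.

The resources of a digital library may not be in one place, physically or logically, so retrieval in a distributed environment is another important topic for digital libraries.

Retrieval in a distributed environment can be solved in two ways:

Of these, federated search means: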

Federated search is the support for finding items that are scattered among a distributed collection of information sources or services, typically involving sending queries to a number of servers and then merging the results to present in an integrated, consistent, coordinated format.
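
The definition above (send the query to several servers, then merge the results into one consistent list) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure from the book. The two "servers" are in-memory dictionaries with invented titles, and the relevance scores are assumed to be comparable across sources, which real systems cannot take for granted.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "servers"; a real system would query remote services.
CATALOG = {
    "library_a": [("Metadata basics", 0.9), ("Digital libraries", 0.4)],
    "library_b": [("Digital libraries", 0.7), ("Thesaurus design", 0.5)],
}


def query_source(source, term):
    """Pretend to send the query to one server and return (title, score, source) hits."""
    term = term.lower()
    return [
        (title, score, source)
        for title, score in CATALOG[source]
        if term in title.lower()
    ]


def federated_search(term):
    """Send the same query to every source in parallel, then merge the results."""
    sources = sorted(CATALOG)
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        result_lists = list(pool.map(lambda s: query_source(s, term), sources))

    # Merge: keep the best-scoring copy of each title and remember its source.
    best = {}
    for hits in result_lists:
        for title, score, source in hits:
            if title not in best or score > best[title][0]:
                best[title] = (score, source)

    # Present one consistent ranked list, highest score first.
    return sorted(
        ((title, score, source) for title, (score, source) in best.items()),
        key=lambda hit: (-hit[1], hit[0]),
    )


if __name__ == "__main__":
    for hit in federated_search("digital"):
        print(hit)
```

Merging is the hard part in practice: duplicate detection here is a plain title match, and ranking assumes a shared score scale. Both are simplifications made for the sketch.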

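Federated search currently goes by many names, among them metasearch and cross-database search. I have not yet looked into whether their concrete workflows and steps differ, though I probably should. Modern Information Retrieval includes a diagram of a working system (BioKleisli) as an example.

(Unable to post the figure?)

It shows how closely this resembles the Metasearch standard that the NISO group is currently drafting.

The book by Ricardo and Berthier describes the concrete steps of federated search as follows:

This is somewhat vague and hard to follow. By comparison, the master's thesis of a computer science student at Sun Yat-sen University (Du Jianfeng, 杜剑峰) studies the matter more carefully:

I also need to consult some recent papers from abroad.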



Modern Information Retrieval

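Modern Information Retrieval, written in 1999 by two computer science professors, Ricardo Baeza-Yates of Chile and Berthier Ribeiro-Neto of Brazil, has been cited heavily in recent years. Many scholars rate it highly, and it has become a textbook or required reading at many universities. I downloaded the book's introduction, Chapter 1, and Chapter 10 from the web and found it good indeed: the structure is clear and, above all, the content is fairly up to date. By comparison, what domestic information retrieval courses teach is either old-fashioned material or a hodgepodge that fits nowhere.

I checked the library catalog and, to my surprise, we hold a copy; I will borrow it after the holiday.

Website: http://sunsite.dcc.uchile.cl/irbook/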



Domain Ontology: Information Retrieval over Wide-Area Networks

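Time feels tight for the thesis; I have to keep at it even over the Chinese New Year.

Sorting out my thinking:

The thesis topic is really the problem of information search over wide-area networks. The problem domain centers on the digital library as "one kind" of wide-area information environment (which must first be clearly defined), and I hope to address it with ideas from the Semantic Web, including the use of metadata and ontologies.

I first need an ontology for the problem domain I want to tackle:

So I first have to find some survey papers to read.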



National Taiwan University "Knowledge Organization" Courseware as a Reference

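I am to teach a course on "Knowledge Organization and Metadata" to a graduate class. The department has not assigned a textbook, and there does not seem to be a suitable one at present. While preparing the course content recently, I found that the information science department at National Taiwan University (that is, Professor Chen Hsueh-hua's (陈雪华) department) offered a similar course called "Knowledge Organization" back in 2003, and all the courseware can be downloaded. I was overjoyed. (See http://ceiba3.cc.ntu.edu.tw/course/cb9879/ ).

Having looked through the NTU course content, my overall impression is that NTU's "Knowledge Organization" leans toward the knowledge organization needed in "knowledge management", that is, the currently popular kind used in many knowledge-based enterprises (consulting firms, IT R&D companies, and so on), rather than knowledge representation and manipulation as derived from philosophical epistemology, logic, or computer science. It therefore looks like a cross between library science, computer science, and management. The content is very rich and practical, but as a disciplinary framework it feels a little disorganized; turning the teaching of this course into a textbook would still take considerable effort.
Moreover, the content dates from 2003 or earlier, and ontologies have advanced a good deal in the last two years, so the course materials look somewhat dated.

Peking University asks that graduate courses not teach the subject matter in exhaustive detail; an outline is enough, so that students grasp the framework, then study on their own and draw conclusions from practice. The NTU courseware does not quite fit this requirement either; it reads like an undergraduate course. Still, I want to prepare my teaching materials in as much detail as possible and stay flexible when lecturing. Doing so helps me shape some research topics of my own, and it lets students study independently once they have the courseware and go on to choose a research direction.

Looking again at the courseware I have prepared, the metadata part is still over-emphasized. It grew out of my metadata lectures rather than starting from the perspective of knowledge organization, which would explain the role and origins of metadata more clearly.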

