Category: Knowledge Organization

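ADL: an old example that is still worth consulting

The Alexandria Digital Library project, led by the University of California, Santa Barbara, has been running since DLI1 and has by now essentially come to an end. Its work on digital library architecture, distributed resource organization and management, and the application of collection-level metadata all concerns me closely; only one strand, the authority control and management of geographic information, is not my focus.

As early as 1999 the Alexandria Digital Library project set out the functions of collection description metadata. Standardization today is in fact still following the same path, though progress does not look great. ADL's original statement reads as follows (see the 1999 paper: http://www.alexandria.ucsb.edu/%7Egjanee/archive/1999/jasis-paper.pdf Linda Hill et al., Collection Metadata Solutions for Digital Library Applications):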

The Alexandria Digital Library (ADL) Project has designed and implemented collection metadata for several purposes: in XML form, the collection metadata “registers” the collection with the user interface client; in HTML form, it is used for user documentation; eventually, it will be used to describe the collection to network search agents; and it is used for internal collection management, including mapping the object metadata attributes to the common search parameters of the system.

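Looking at it now, ADL's collection metadata has many "non-standard" features: attributes added "arbitrarily" in order to implement a function. Because its application platform was a client/server (C/S) architecture, the encoding is XML but the vocabulary is custom-defined. Descriptions of digital objects are encapsulated in Buckets; the project specifies the types and structure of Buckets, as well as a Core Bucket.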
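To make the idea of mapping object metadata attributes onto common search parameters a little more concrete, here is a minimal Python sketch. The bucket names, the source attribute names, and the `to_buckets` helper are all hypothetical illustrations; they are not taken from ADL's actual bucket definitions.

```python
from collections import defaultdict

# Hypothetical mapping from collection-specific attribute names to shared
# search buckets; these names are illustrative, not ADL's real vocabulary.
ATTRIBUTE_TO_BUCKET = {
    "title": "titles",
    "map_sheet_name": "titles",
    "creator": "originators",
    "bounding_box": "geographic-locations",
    "date_issued": "dates",
}


def to_buckets(record):
    """Group a record's values under the shared bucket each attribute maps to.

    Attributes with no mapping are skipped, so they would stay searchable
    only through the collection's own interface.
    """
    buckets = defaultdict(list)
    for attribute, value in record.items():
        bucket = ATTRIBUTE_TO_BUCKET.get(attribute)
        if bucket is not None:
            buckets[bucket].append(value)
    return dict(buckets)


example = to_buckets({
    "map_sheet_name": "Santa Barbara quadrangle",
    "creator": "USGS",
    "scale": "1:24000",  # unmapped, so it is dropped from the bucket view
})
```

The point of the sketch is only that each collection keeps its own attribute names while a client searches a small, shared set of buckets.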

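Some thoughts:

The DLI2 projects at three universities, Cornell, UC Santa Barbara, and Stanford, are relevant to my thesis. The main topics involved are:

Symbols, data, information, knowledge, wisdom, spirit

Our generation will probably live to see an era in which an individual can "own" all of human knowledge, and even "record" the knowledge of every single person. But "owning" vast amounts of knowledge does not automatically make us wise; we must know how to apply knowledge, manipulate it, and grasp the connections between pieces of knowledge before we can be smarter than our predecessors.

Digital libraries manage knowledge through the organization of information. The step from information to knowledge is a mysterious one, determined by human cognition. For a computer, therefore, its capability always lies in information processing; what it presents to people through all kinds of tools then becomes knowledge. (Pursuing this question further leads into philosophy, so I skip it here.)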

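Data

Knowledge

Wisdom

Knowledge with insight: having understood knowledge from many sides, one can foresee what will happen and take action. For example, everyone knows that tickets to Hangzhou are very hard to get during the National Day holiday (knowledge), but you foresaw this and bought your ticket early, one step ahead (wisdom). Wisdom is shown in using knowledge to take the right action.

Classification of knowledge

Tacit knowledge and explicit knowledge

The know-X classification of knowledge

Studying the transformation and cycle of knowledge forms from the perspective of enterprise knowledge management:

The process above comprises three levels:

The two main processes of the knowledge cycle are sharing and innovation.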

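The innovation process:

1. Create: generating new ideas. Knowledge networks encourage ideas to be cross-applied from different angles, so they often drive the innovation cycle.

2. Edit: prototype designs, process specifications, and the like are produced at this stage. This step puts ideas into a form that circulates more easily.

3. Embed: at this stage the model is refined further, and the knowledge it carries is built into production processes and company procedures.

4. Disseminate: bringing the product to market, or putting the new processes and workflows into practice within the company.

The sharing process:

1. Collect: existing knowledge is collected routinely or as needed, usually in the form of a knowledge directory or a knowledge map.

2. Organize / store: knowledge is classified and stored, often in a company-specific database or classification scheme, which makes later retrieval easier. This step generally relies on help from specialists.

3. Share / disseminate: information is routinely passed to the people who need it; this is the "push" of information. Meetings and events of all kinds serve as tools for sharing tacit knowledge.

4. Access: information on file servers or database servers is made easier to read, so users can "pull" it when they need it.

5. Use / exploit: knowledge becomes part of the workflow and is refined and developed. Applying it creates additional knowledge, and at the same time the cycle repeats.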

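Kong Fansheng's Principles of Knowledge Base Systems (《知识库系统原理》), by contrast, layers and classifies knowledge from the perspective of computer science.

Knowledge comprises four layers:

and two types:

What is knowledge? What is learning? These concepts are very close to us, and yet very far away.

1. The philosophical perspective
In Knowledge Transformation and Educational Reform (《知识转型与教育改革》), Shi Zhongying (chair of the Department of Education at Beijing Normal University, born 1967) gives a systematic literature review and in-depth analysis, at the philosophical level, of what knowledge is: in the introduction (Knowledge and Education), Chapter 1 (Knowledge, Knowledge Forms, and Knowledge Transformation), and Chapter 2 (Three Knowledge Transformations in Human History).

Some excerpts, summarized:
Historically, answers to the question of what knowledge is have generally touched on the following relations:
The relation between knowledge and the knower (individual or social identity; active or passive; led by reason or by the senses; statement or belief);
The relation between knowledge and the object of knowing (is knowledge a mirror-like reflection of the external world; does true knowledge correspond to the external world; is the empirical evidence for knowledge's claim to truth sufficient; does the object exist objectively before it is known);
The logic of knowledge as statement (is there a unified or standard form of statement for knowledge; are concepts and propositions logical constructions or products of history and culture; how do the concepts and propositions of different fields differ; how can statements of knowledge be justified);
The relation between knowledge and society (is knowledge value-neutral; is scientific research a purely intellectual activity; how does knowledge relate to interests, power, ideology, gender, and so on; how do social factors constrain the production, dissemination, and allocation of knowledge).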

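2. The economic perspective
The knowledge economy; knowledge management.

3. The social perspective
Knowledge innovation and knowledge dissemination; the social nature of knowledge.

4. The computational perspective
In a November 2003 lecture, "The Computation of Knowledge and HowNet (《知网》)", Professor Dong Zhendong of the Computer Language Information Center, Chinese Academy of Sciences, said:
"Knowledge is a system that reveals the relations between concepts, and between the attributes of concepts; the breadth and depth of a knowledge system depend on how many such relations it has.
For a computer-oriented knowledge system, the key to quality is its computability and the services it can therefore provide to specific applications."
He also gave a concrete description of the primitives of knowledge, namely concepts and attributes, as well as the design mechanism of HowNet as a computable knowledge system and its use internationally.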



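DC2005 and ECDL2005

This year the DC metadata annual conference and the European digital library conference ECDL will be held back to back in mid-to-late September, the former in Madrid, the capital of Spain, the latter in Vienna, the city of music. The calls for papers hint at subtle shifts in the topics this field cares about.
For DC, this year is its tenth anniversary, so it should carry some celebratory meaning.
If I were to submit a paper, building a personal name authority file with FOAF would be a good topic: it ties together distributed digital library architecture, metadata applications, and vocabulary authority control, with both theory and practice.
Official DC2005 website: http://dc2005.uc3m.es/
Official ECDL2005 website: http://www.ecdl2005.org/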

DC2005 CFP

Metadata based on standards such as Dublin Core are a key component of information environments from scientific repositories to corporate intranets and from business and publishing to education and e-government.

DC-2005 – the fifth in a series of conferences previously held in Tokyo (2001), Florence (2002), Seattle (2003), and Shanghai (2004) – will examine the practicalities of maintaining and using controlled sets of terms (“vocabularies”) in the context of the Web.

DC-2005 aims at bringing together several distinct communities of vocabulary users:

These diverse communities share common problems, from the use of identifiers for terms to practices for developing, maintaining, versioning, translating, and adapting standard vocabularies for specific local needs. Topics of particular relevance include:

The Program Committee would like to solicit contributions of the following types:

Paper submissions will be peer-reviewed by the program committee and published both in print and electronically in the conference proceedings. All accepted papers must be presented at the conference by at least one of their authors.

The official language of the conference is English, but we will provide simultaneous translation (English-Spanish) for keynotes, tutorials, and plenary sessions.



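JCDL-NKOS Workshop 2005

This year's JCDL, jointly sponsored by IEEE and ACM, will be held in Denver in mid-June, and NKOS again has a joint session. Worth noting: Marcia Zeng (曾蕾) and Jian Qin (秦健) are core members of this group. Below is their workshop plan, which is also an announcement; it shows where things are heading this year.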

Integration Challenges and Strategies

The 7th Networked Knowledge Organization Systems (NKOS) Workshop

Gail Hodge

Linda Hill

Jian Qin

Douglas Tudhope

Marcia Zeng

ABSTRACT

This year's Networked Knowledge Organization Systems (NKOS) workshop builds on seven years of workshops in the U.S. and Europe on issues enabling networked knowledge organization systems (KOS), such as classification systems, thesauri, gazetteers, taxonomies, and ontologies, to support the description, retrieval, and use of diverse information resources. Now many efforts are underway to research the issues and implement solutions to the challenges of networking and integrating KOS across somewhat isolated domains: indexing services and thesaurus builders; computer scientists and system integrators; ontologists; taxonomists; and others. Requirements to solve these integration issues have become mission critical in many cases; the need to support computational, programmatic integration to handle masses of data from independent sources is pushing the research and development agenda. The need to move forward to meet these challenges while at the same time applying the best practices and "wisdom" developed through years of practical experience is acute.

The JCDL-NKOS workshop for 2005 will bring together researchers and implementers from diverse international communities who are developing new models, conducting research, and implementing practical solutions for networking KOS and integrating the associated information and data resources.

Topics may include:

Keywords

Controlled Vocabularies, Thesauri, Topic Maps, Ontologies, Networked Knowledge Organization Systems

AIM AND OBJECTIVES

The primary aim of the workshop is to inform NKOS researchers and practitioners about developments across a number of communities and to identify research and development directions. The objectives are to encourage sharing of new initiatives and lessons learned and to identify collaborative development opportunities.

PROGRAM

Session 1: Welcome and introduction to NKOS and to the workshop

Session 2: Self-introductions and brief descriptions of projects and interests particularly as they relate to the topic of integration and interoperability

Session 3: Case studies about the interoperability and integration of KOSs (approximately 2 presentations)

Session 4 : Presentations on methodologies, tools and strategies (approximately 2)

Session 5 : Presentations on recent related standards activities (approximately 2)

Session 6: Open discussion of the issues raised in the previous sessions

Session 7: Identification of a research agenda based on reactions from a panel of practitioners, software developers, academics and standards developers

WORKSHOP FORMAT

The full-day workshop will include invited and accepted presentations, guided by a program committee. Presentations will be grouped into topic sessions and demonstrations can be set up for access before and after the workshop and during breaks. Discussion and identification of issues will be encouraged by providing a significant amount of time for open discussion and networking opportunities. Participants will be given the opportunity to introduce their work and their interests. The identification of a research agenda will involve significant facilitated discussions.

ATTENDEES

The workshop will be announced via the NKOS listserv, which now has over 100 members from more than 10 countries. In addition, other relevant listservs and groups, such as ASIST-L, ASIST SIG-CR, ECDL, Dig-Lib, ASI, and standards-related groups will be notified. Approximately 25-35 participants are expected, though the organizers would prefer not to limit attendance. If necessary, the participants will be accepted in the order of registration. As in the past, the organizers will work with the JCDL organizers to coordinate logistics and monitor registrations.

WORKSHOP ORGANIZERS (omitted)

PREVIOUS NKOS ACTIVITIES

This is the 7th in the series of NKOS workshops held in conjunction with the JCDL. Past topics have included protocols for networked KOS, requirements for electronic thesauri, digital gazetteers, and moving traditional KOS into semantic web concepts. Attendance has ranged from 15 to over 50 people. NKOS is an ad hoc working group of academics and practitioners interested in various knowledge organization systems in networked environments. The listserv, hosted by NSF DL2, includes over 100 professionals from more than 10 countries. The NKOS Web site ( http://nkos.slis.kent.edu ) is hosted by Kent State Univ. NKOS-related sessions have been held at ECDL. The ECDL and JCDL NKOS sessions have resulted in proceedings which are available from the NKOS Web site, and in special issues of the Journal of Digital Information.



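Meta-search: will SRW/U become the NISO Metasearch standard?

1. D-Lib Magazine has published an article on SRW/U that compares SRW/U with the OAI protocol and proposes a way to use the two together. It made me think that the "Knowledge Organization" course should introduce both protocols, along with their compatibility.

2. SRW/U, as the Web/XML version of Z39.50, is a thorough transformation; in fact the functions of Z (ZING) have been taken over by a series of new protocols, not by SRW/U alone. See: http://www.loc.gov/z3950/agency/zing/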

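3. In the same issue of D-Lib I also saw the metadata scheme of the OLAC project, published online as a fairly well-formed DC profile, which our project can use as a reference.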
http://www.language-archives.org/OLAC/metadata.html
http://www.language-archives.org/REC/olac-extensions.html

4. What exactly is the relationship between NISO's Metasearch Initiative ( http://www.niso.org/committees/MetaSearch-info.html ) and ZING? Perhaps NISO hopes that the ZING work can serve as the standard for next-generation metasearch.
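As a rough aid for comparing the two protocols mentioned in item 1, the sketch below builds one request URL of each kind: an SRU `searchRetrieve` request (the server runs a CQL query and returns matches) and an OAI-PMH `ListRecords` request (the client harvests records for its own use). The parameter names follow the published SRU 1.1 and OAI-PMH 2.0 specifications as I understand them; the base URLs and the two helper functions are hypothetical.

```python
from urllib.parse import urlencode


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU searchRetrieve URL (search: the server runs a CQL query)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


def oai_list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords URL (harvest: the client copies records)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


# example.org endpoints are placeholders, not real services.
sru_url = sru_search_url("http://example.org/sru", 'dc.title = "metadata"')
oai_url = oai_list_records_url("http://example.org/oai")
```

Seen this way, the two are complementary rather than competing: one asks a question at query time, the other moves metadata in bulk ahead of time.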

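Addendum, 2005/2/23:

In Nianxin's (年心) blog ( http://hjn66.blogchina.com/ ) I saw a distinction among kinds of metasearch that seems to make some sense; I wonder whether it is the common understanding in China?

2. Integrated retrieval: the metadata of each database is copied out to build a new secondary literature database, with link management back to the source documents. This approach is technically difficult and needs the support of the data vendors. TRS is of this type, and this is also the direction in which cross-database retrieval in digital libraries abroad is developing; data vendors in China, however, are relatively closed, so it is not easy to carry out.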
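The integrated-retrieval approach just described can be sketched as a single local index built from copied metadata, with each entry linking back to its source. This is only a toy illustration: the records, the example.org links, and the `build_union_index` and `search` helpers are invented, and real systems such as TRS are certainly more elaborate.

```python
from collections import defaultdict

# Hypothetical harvested records: (source database, title, link to full text).
HARVESTED = [
    ("db_a", "Metadata for digital libraries", "http://example.org/a/1"),
    ("db_b", "Digital gazetteers", "http://example.org/b/7"),
    ("db_b", "Thesaurus construction", "http://example.org/b/9"),
]


def build_union_index(records):
    """Index every title word once, centrally, pointing back to source links."""
    index = defaultdict(set)
    for _source, title, link in records:
        for word in title.lower().split():
            index[word].add(link)
    return index


def search(index, term):
    """Answer a query from the local index alone; no source is contacted."""
    return sorted(index.get(term.lower(), set()))


index = build_union_index(HARVESTED)
hits = search(index, "digital")
```

The dependence on vendors shows up in the first step: without permission to copy the metadata out, there is nothing to index.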

To be continued…



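Retrieval problems in digital libraries

Continuing to study the parts of Modern Information Retrieval related to my recent interests: metasearch, the basic problems of digital libraries, knowledge organization, and so on.

Modern Information Retrieval offers a view of digital libraries from the standpoint of computer science:

A digital library is:

The authors also hold that, because digital libraries cross geographic boundaries, multilingual support is their foremost problem. Solving it starts with character sets, which can be handled by downloading them over the network; cross-language retrieval is likewise an important open problem. QBIC, visual browsing, and visual aids help make cross-language retrieval possible.

Multimedia retrieval is also one of the core technologies of digital libraries.

Taking the document as the structural unit of a digital library, the structure of documents and their metadata can give the digital library structure and semantics at the micro level. Structure and semantics are the most important content of a digital library.

Resources in a digital library may not be in one place, physically or logically, so retrieval in a distributed environment is another important topic for digital libraries.

Retrieval in a distributed environment can be handled in two ways:

Of these, federated search means: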

Federated search is the support for finding items that are scattered among a distributed collection of information sources or services, typically involving sending queries to a number of servers and then merging the results to present in an integrated, consistent, coordinated format.
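The definition above (send the query to several servers, then merge the results into one consistent list) can be sketched in a few lines of Python. Everything here is a stand-in: `source_a` and `source_b` are in-memory functions playing the role of remote servers, and the merge rule (keep the best score per title) is a simplifying assumption, since scores from different servers are generally not directly comparable.

```python
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-ins for remote servers: each takes a query string and
# returns (title, score) pairs in its own order.
def source_a(query):
    docs = {"Digital gazetteers": 0.9, "Digital libraries": 0.6}
    return [(t, s) for t, s in docs.items() if query.lower() in t.lower()]


def source_b(query):
    docs = {"Digital libraries": 0.8, "Thesauri online": 0.7}
    return [(t, s) for t, s in docs.items() if query.lower() in t.lower()]


def federated_search(query, sources):
    """Broadcast the query to every source, then merge into one ranked list."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda src: src(query), sources))
    best = {}
    for results in result_lists:
        for title, score in results:
            # Keep one entry per title, with the highest score any source gave.
            best[title] = max(score, best.get(title, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


merged = federated_search("digital", [source_a, source_b])
```

Unlike a harvested central index, nothing is copied in advance here; every query goes out to the sources at search time.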

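Federated search currently goes by many names, metasearch and cross-database search among them. I have not yet looked into whether their concrete workflows and steps differ, and probably should. Modern Information Retrieval includes a diagram of a working system (BioKleisli) as an example.

(Unable to paste the figure?)

It is strikingly similar to the Metasearch standard that the NISO group is currently drafting.

The concrete steps of federated search are described in Ricardo and Berthier's book as follows:

This is somewhat vague and hard to follow. By comparison, the thesis of a master's student in computer science at Sun Yat-sen University (Du Jianfeng) studies the matter rather more carefully:

In addition, some recent papers from abroad still need to be consulted.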



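National Taiwan University "Knowledge Organization" courseware for reference

I am to teach a graduate course, "Knowledge Organization and Metadata". The department has specified no textbook, and there does not seem to be a suitable one at present. While preparing the course recently, I found that the information science department at National Taiwan University (that is, Professor Chen Hsueh-hua's department) had offered a similar course, called "Knowledge Organization", back in 2003, and that all the courseware can be downloaded. I was overjoyed. (See http://ceiba3.cc.ntu.edu.tw/course/cb9879/ .)

Having looked through NTU's course content, my overall impression is that NTU's "Knowledge Organization" leans toward the knowledge organization needed in "knowledge management", that is, the currently popular kind used in many knowledge-based companies (consultancies, IT R&D firms, and so on), rather than knowledge representation and manipulation as rooted in philosophical epistemology, logic, or computer science. It therefore looks like a cross between library science, computer science, and management. The content is very rich and practical, but as a disciplinary system it feels somewhat disorganized; turning the teaching of this course into a textbook would still take considerable work.
Moreover, the content dates from 2003 or earlier; ontologies have advanced a good deal in the past two years, so the course materials look a little dated.

Peking University asks that graduate courses not teach the content in exhaustive detail; an outline is enough for students to grasp the framework, study on their own, and draw conclusions from practice. NTU's courseware does not seem to fit this requirement well either; it reads like undergraduate teaching. Still, I want my own teaching materials to be as detailed as possible, and I can be flexible when lecturing. This helps me form research topics of my own, and it lets students who get the courseware study independently and go on to choose a research direction.

Looking again at the courseware I have prepared, the metadata part is still overemphasized: it grew out of a metadata lecture rather than starting from knowledge organization, which would explain the role and background of metadata more clearly.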



Some books and writings on knowledge/information organization

See below:

Classic foreign-language books and writings on knowledge / information organization

Anderson, J. D. (2003). Organization of knowledge. IN: International Encyclopedia of Information and Library Science. 2nd ed. Ed. by John Feather & Paul Sturges. London: Routledge (pp. 471-490).

Bade, D. (2002). The Creation and Persistence of Misinformation in Shared Library Catalogs: Language and Subject Knowledge in a Technological Era . David Bade, Urbana, IL: Graduate School of Library and Information Science, University of Illinois; (ISBN: 0-87845-120-X.)

Bliss, H. E. (1929). The organization of knowledge and the system of the sciences . By Henry Evelyn Bliss ; with an introduction by John Dewey. New York: Henry Holt and Company.

Bliss, H. E. (1934). The Organization of Knowledge in Libraries and the subject-approach to books . New York: The H. W. Wilson Company.

Bliss, H. E. (1935). A system of bibliographical classification . New York: The H. W. Wilson Company.

Capurro, R & Hjørland, B. (2003). The Concept of Information. Annual Review of Information Science & Technology, Vol. 37 , Chapter 8, pp. 343-411.

Dewey, J. (1929). Introduction. IN: H. E. Bliss: The organization of knowledge and the system of the sciences . New York, Holt (pp. vii-ix).

Feger, H. (2001). Classification: Conceptions in the Social Sciences. International Encyclopedia of the Social & Behavioral Sciences, Vol. 3, pp. 1966-1973 . Amsterdam: Elsevier Science, Ltd. (Online version with abstract published 2002)

Frohmann, Bernd. (1990). Rules of Indexing: A Critique of Mentalism in Information Retrieval Theory. Journal of Documentation , 46: 81-101.

Frohmann, B. (2003). Grounding a theory of documentation. Paper presented at DOCAM '03 The first annual meeting of the Document Academy. August 13-15, 2003 at The School of Information Management and Systems (SIMS) at The University of California, Berkeley. http://thedocumentacademy.hum.uit.no/events/docam03.abstracts/bernd.frohman.html

Frohmann, B. (2004). Deflating Information. From Science Studies to Documentation . University of Toronto Press.

Furner, J. (2004). Information studies without information . LIBRARY TRENDS , V52, N3 (WIN), P427-446.

Hjørland, B. (2002). Domain analysis in information science. Eleven approaches – traditional as well as innovative. Journal of Documentation, 58 (4), 422-462.

Hjørland, B. (2002). Principia Informatica. Foundational Theory of Information and Principles of Information Services. IN: Emerging Frameworks and Methods. Proceedings of the Fourth International Conference on Conceptions of Library and Information Science (CoLIS4) . Ed. By Harry Bruce, Raya Fidel, Peter Ingwersen, and Pertti Vakkari. Greenwood Village, Colorado, USA: Libraries Unlimited. (Pp. 109-121).

HULME, E. WYNDHAM. 1911a. Principles of Book Classification: Introduction. In: Hulme, E. Wyndham. Library Association Record, 1911; 13: 354-358.

HULME, E. WYNDHAM. 1911b. Principles of Book Classification: Chapter II – Principles of Division in Book Classification. In: Hulme, E. Wyndham. Library Association Record, 1911; 13: 389-394.

HULME, E. WYNDHAM. 1911c. Principles of Book Classification: Chapter III – On the Definition of Class Headings, and the Natural Limit to the Extension of Book Classification. In: Hulme, E. Wyndham. Library Association Record, 1911; 13: 444-449.

ISO 5127: 2001 Information and Documentation – Vocabulary. International Standards Organization.

Miksa, F. (1998). The DDC, the Universe of Knowledge, and the Post-Modern Library. Albany, NY: Forest Press.

Richardson, E. C. (1930/1964). Classification: Theoretical and practical . New York: The H. W. Wilson Co., 1930. (Reprinted unaltered 1964)

Sayers, W. C. (1915). Canons of classification applied to “the subject” “the expansive”, “the decimal” and “the Library of Congress” classifications : a study in bibliographical classification method. London: Grafton & Co.

Smiraglia, R. P. (2001). The nature of “a work”: implications for the organization of knowledge. Lanham, Md.: Scarecrow Press.

Sowa, J. F. (2000). Knowledge representation : logical, philosophical, and computational foundations . Pacific Grove, California: Brooks/Cole.

Spang-Hanssen, H. (2001): How to teach about information as related to documentation. Human IT. 2001, (1), pp. 125-143. http://www.hb.se/bhs/ith/1-01/hsh.htm (Visited April 13, 2004).

Steen Larsen, P. (2003). Terms and definitions related to the information process, drawn from ISO 5127: 2001 Information and Documentation – Vocabulary. Unpublished paper.

Svenonius, E. (2000). The Intellectual Foundation of Information Organization . Cambridge, Massachusetts: The MIT Press.

Taylor, A. G. (1999). The Organization of information. Englewood, Colorado: Libraries Unlimited.

Thellefsen, T.L. (2002). Semiotic knowledge organization: Theory and method development. Semiotica, 142 (1-4), 71-90.

Webber, S. (2003). Information science in 2003: a critique. Journal of Information Science , 29(4), 311-330.

