Sep 14
Web Intelligence, World Knowledge and Fuzzy Logic
Location: 405 Soda Hall

Tuesday, September 21, 2004

Sep 21
Text Summarization -- In Search of Effective Ideas and Techniques
Location: 405 Soda Hall
Sep 28
Decision Tree using Evolutionary Techniques: GA-GP-Based Fuzzy Decision Tree Model
Location: 405 Soda Hall
Oct 5
The Failure of Clustering in Search User Interfaces
Location: 405 Soda Hall

Tuesday, Oct 26, 2004

Oct 26
Tuesday; 1:30-2:30pm
Semantics – the implicit, the formal and the powerful (with a case study in Glycomics)
Location: 606 Soda Hall

Details: Semantics has been recognized as the key to the next generation of more powerful information systems for better search, integration, question answering, as well as analysis/discovery. Semantics has long been studied in many disciplines including linguistics, AI, IR, information and database systems, and soft computing, and a rich variety of approaches, techniques and tools have been developed. More recently, the Semantic Web community has made a concerted effort in using semantics by defining standards for the modeling of knowledge based on Description Logic (DL) based languages such as OWL, and has focused on corresponding reasoning techniques that have a rather narrow set of applications. We view these recent approaches as addressing a subset of challenges, complemented by techniques that deal with a broad variety of information and knowledge, expressiveness, computational capabilities and computability. In this talk we attempt to organize this broad variety of options from a pedagogic perspective that characterizes semantic approaches as implicit (such as those based on statistical and machine learning), formal (such as those based on DL to FOL) and powerful (such as those based on soft computing).

To exemplify this perspective, we look at some examples from the domain of life sciences, which offer rich sets of challenges due to the complexity of biological systems. More specifically, we look at our current research in Glycomics, which involves requirements for the creation of taxonomies and ontologies with higher expressive representations, semantic annotation of textual data in heterogeneous formats as well as machine generated scientific data, semantic search of scientific literature, semantic integration of heterogeneous textual and scientific (nontextual) data, wrapping of data analysis tools with semantically annotated Web Services, and development of semantic web processes leading to better and quicker interpretation/analytics and discovery.

In particular, we will offer concrete examples demonstrating the need for more expressive representations that follow the ideas offered by Prof. Zadeh in “Toward a perception-based theory of probabilistic reasoning with imprecise probabilities”. Among the novel research outcomes are automatic taxonomy generation in Taxaminer, development of a comprehensive ontology for Glycomics called GlycO, early efforts in semantic annotation of machine generated scientific data, and preliminary ideas about fpOWL, an extension to the ontology language OWL that allows probabilistic and fuzzy reasoning. See the project at the LSDIS lab for more information.

Acknowledgement: Will York, Christopher Thomas, and other members of the Bioinformatics for Glycan Express project team; Cartic Ramakrishnan and members of the Taxaminer team.

About the speaker: Amit Sheth is an Educator, Researcher and Entrepreneur. He joined the University of Georgia and started the LSDIS lab in 1994. Earlier, he served in R&D groups at Bellcore (now Telcordia Technologies), Unisys, and Honeywell. In August 1999, Sheth founded Taalee, Inc., a VC funded enterprise software and internet infrastructure startup based on the technology developed at the LSDIS lab. He managed Taalee as its CEO until June 2001. Following Taalee's acquisition/merger, he serves as the CTO and co-founder of Semagix, Inc. (formerly Voquette, Inc). His research has led to several commercial products and applications. He has published over 175 papers and articles (in the areas of semantic interoperability, federated databases, workflow management, Semantic Web), given over 130 invited talks and colloquia including 19 keynotes, (co)-organized/chaired twelve conferences/workshops, and served on over 90 program committees. He is a member of the W3C Advisory Committee, SWSA, etc.
http://lsdis.cs.uga.edu/~amit and http://www.semagix.com/company_team.html

Tuesday, Nov 9, 2004

Nov 9
4:00-5:30pm
Web Search as a Computational Challenge
Location: 306 Soda Hall

Wednesday, November 10, 2004

Nov 10
1:00-2:30pm
Social network analysis of text
Location: 380 Soda Hall
Seminar Speaker Name: Dragomir R. Radev
Seminar Speaker Affil.: University of Michigan
Seminar Series: BISC seminar
For more information: http://www-bisc.cs.berkeley.edu

Details: Textual data is everywhere, in email and scientific papers, in online newspapers and e-commerce sites. The Web contains more than 200 terabytes of text, not even counting the contents of dynamic textual databases. This enormous source of knowledge is seriously underexploited. Textual documents on the Web are very hard to model computationally: they are unstructured, time-dependent, collectively authored, and of uneven importance. Traditional grammar-based techniques don't scale up to address such problems. Novel representations and analytical tools are needed.

NewsInEssence (www.newsinessence.com) is a system that crawls the Web for news, automatically clusters the articles by topic, and produces user-defined extractive summaries of each cluster. A recent addition to the battery of summarization algorithms available to NewsInEssence is the Cosine Centrality method. In this talk I will describe how one can apply the theory of social networks and stochastic processes (in particular rank-based prestige and random walks on undirected graphs) to multi-document text summarization. (I will begin my talk with a short tutorial on the mathematics needed for the rest of the talk.) If time permits, at the end of the talk, I will quickly describe two recent ongoing projects in my research group: one on machine learning for object classification using random walks on bipartite (feature-object) graphs and another on using phylogenetic techniques for fact tracking in evolving multi-document summarization.

Short Bio: Dragomir R. Radev is Assistant Professor of Information, Electrical Engineering and Computer Science, and Linguistics at the University of Michigan, Ann Arbor. He leads the CLAIR (Computational Linguistics And Information Retrieval) group, which currently includes 12 undergraduate and graduate students. Dragomir holds a Ph.D. in Computer Science from Columbia University. Before joining Michigan, he was a Research Staff Member at IBM's TJ Watson Research Center in Hawthorne, NY. He is the author of more than 45 papers on information retrieval, text summarization, graph models of the Web, question answering, machine translation, text generation, and information extraction. Dr. Radev's current research on probabilistic and link-based methods for exploiting very large textual repositories, representing and acquiring knowledge of genome regulation, and semantic entity and relation extraction from Web-scale text document collections is supported by NSF and NIH. Dragomir serves on the HLT-NAACL advisory committee, was recently reelected as treasurer of NAACL, is a member of the editorial boards of JAIR and Information Retrieval, and is a four-time finalist at the ACM international programming finals (as contestant in 1993 and as coach in 1995-1997). Dragomir received a graduate teaching award at Columbia and, recently, the U. of Michigan award for Outstanding Research Mentorship (UROP).

Tuesday, Nov 30, 2004; 405 Soda Hall, 4:00-5:30pm

Tuesday, Nov 30
4:00-5:30pm
Title: Uncertainty in an unknown world
Prof. Stuart Russell
Computer Science Division, University of
Description:
Recent advances in knowledge representation for probability models have
allowed for uncertainty about the properties of objects and the relations
that might hold among them. Such models, however, typically assume
exact knowledge of which objects exist and of which object is which---that
is, they assume *domain closure* and *unique names*. These
assumptions greatly simplify the sample space for probability models,
but are inappropriate for many real-world situations. This talk
presents a formal language, BLOG, for defining probability models over
worlds with unknown objects and in which several terms may refer to
the same object. The language has a simple syntax based on first-order
logic, combined with local probability functions for quantifying
conditional dependencies. A key additional element is the *number*
statement, which specifies a conditional distribution over the
number of objects that satisfy a given property. Subject to certain
acyclicity constraints, every BLOG model specifies a unique probability
distribution over the full set of possible worlds for the first-order
language. Furthermore, complete inference algorithms exist for
a large fragment of the language. I will present several example models
and discuss interesting issues arising from the treatment of evidence in such languages.
Nov 30, 2004
4:00-5:30pm; 405 Soda Hall
Recent Research in Cross-language Document Search
Fredric Gey
University of California, Berkeley

Abstract: Cross-language document search research has been underway for more than 10 years now and,
while much progress has been made, certain research challenges remain. This talk will review recent
research in cross-language information retrieval, including the 2004
evaluation workshops: NTCIR for Asian language retrieval in Japan
(http://research.nii.ac.jp/ntcir-ws4/index.html) and
CLEF for European language retrieval (http://clef.iei.pi.cnr.it:2002/), as
well as the U.S. DARPA “Hindi Surprise Language Exercise” of 2003.

Topics to be covered include:
  Language-specific processing (stemming, segmentation, stop-words)
  Word decompounding for German
  Translation disambiguation for bilingual dictionaries
  Parallel corpora induced lexicons
  Web corpora usage for out-of-vocabulary translation
  Special retrieval tasks (Patent Retrieval, Cross-language question answering)
  Geographic information retrieval
  Challenges of less-commonly taught languages
  The road ahead in cross-language information retrieval research

Presenter: Dr. Fredric Gey has been doing research in
cross-language information retrieval since 1998. He and his associates have participated in every cross-language
information retrieval evaluation in the United States, Japan and Europe.
Currently he is working on retrieval (including geographic information
retrieval) of Russian language corpora and other digital objects. Dr.
Gey co-chaired the English-Arabic retrieval evaluation track at the TREC conferences
in 2001 and 2002. He co-chaired a
workshop on “Cross-language Information Retrieval Research: The
Road Ahead” at the ACM SIGIR-2002 conference in Finland. He is co-author of the entry on
“Multilingual Information Retrieval” in the Encyclopedia of Library and
Information Science and co-editor of a forthcoming special issue on Cross-Language Information Retrieval of the Information Processing and Management Journal.