stargeek
Corpus Structure, Language Models, and Ad Hoc Information Retrieval







05/19/2004 04:47 AM

Corpus Structure, Language Models, and Ad Hoc Information Retrieval by Oren Kurland and Lillian Lee
http://eprints.osti.gov/cgi-bin/dexpldcgi?qry1131250613;1

Abstract by Authors:
Most previous work on the recently developed language-modeling approach to information retrieval focuses on document-specific characteristics, and therefore does not take into account the structure of the surrounding corpus. We propose a novel algorithmic framework in which information provided by document-based language models is enhanced by the incorporation of information drawn from clusters of similar documents. Using this framework, we develop a suite of new algorithms. Even the simplest typically outperforms the standard language-modeling approach in precision and recall, and our new interpolation algorithm posts statistically significant improvements for both metrics over all three corpora tested.
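
The idea in the abstract, combining a document's own language model with one built from a cluster of similar documents, can be illustrated with a minimal sketch. This is not the authors' algorithm: the function names, the add-alpha smoothing, the single cluster per document, and the mixing weight lam are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def unigram_model(tokens, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (assumed smoothing)."""
    counts = Counter(tokens)
    total = len(tokens) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def query_likelihood(model, query):
    """Probability of the query terms under a unigram model."""
    return math.prod(model[w] for w in query)

def interpolated_score(query, doc, cluster_docs, vocab, lam=0.5):
    """Hypothetical mix: lam * document model + (1 - lam) * cluster model."""
    doc_model = unigram_model(doc, vocab)
    cluster_tokens = [t for d in cluster_docs for t in d]
    cluster_model = unigram_model(cluster_tokens, vocab)
    return (lam * query_likelihood(doc_model, query)
            + (1 - lam) * query_likelihood(cluster_model, query))

# Toy data (invented for this sketch): d1 and d2 form one cluster, d3 is alone.
d1 = ["language", "model", "retrieval"]
d2 = ["language", "model", "smoothing"]
d3 = ["paper", "origami", "model"]
query = ["smoothing"]
vocab = set(d1) | set(d2) | set(d3) | set(query)

score_d1 = interpolated_score(query, d1, [d1, d2], vocab)
score_d3 = interpolated_score(query, d3, [d3], vocab)
```

In this toy setup neither d1 nor d3 contains the query term, so the document models alone cannot separate them; d1 scores higher only because its cluster neighbor d2 mentions "smoothing".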




This is a GrokNews Entry: (what is grok?)





Similar Items

Grok Headline matches for Corpus Structure, Language Models, and Ad Hoc Information Retrieval

Natural Language Processing /
Information Retrieval Software
Repository
05/12/2004 07:09 AM
Natural Language Processing / Information Retrieval Software Repository
http://www.comp.nus.edu.sg/~rpnlpir/

This directory and account holds centralized software and tools for natural language processing (NLP) and information retrieval (IR) research and teaching at the School of Computing at the National University of Singapore. The account is hosted off of sf3 so that students and researchers can get at these tools. Access is granted to all; however, if you'd like to provide and/or install tools, you must first email the administrators (rpnlpir@comp.... This directory has been added to Directory Resources Subject Tracer™ Information Blog.

The Necessity for Information Space
Mapping for Information Retrieval on the
Semantic Web
08/13/2002 10:03 AM

Information Retrieval Toolkit 05/16/2004 03:22 AM
Development is in 'new' subdirectory

Collaborative Information Retrieval 08/16/2002 05:51 AM

Information Retrieval Software 06/03/2004 05:27 AM
Information Retrieval Software
http://www.ir-ware.biz/

This web site provides links to information retrieval software. You will find links to search engines for searching the web, individual sites, local drives, or local area networks. They provide links to web sites with related subjects, to Internet search engines, web directories, and to online publications on information retrieval. You will find more than 3,000 links in their directory. There are lots of search utilities to discover on the web. You can find all kinds of software to search a site, often written in Perl, Java, or JavaScript. There are also web robots or web spiders, which can be used to search the web as well as a local area network or a local hard drive. This has been added to Directory Resources Subject Tracer™ Information Blog. This will be added to the search engines section of the Internet MiniGuides.

After the Dot-Bomb: Getting Web
Information Retrieval Right This Time
07/11/2002 08:01 AM

Music Information Retrieval Systems 04/27/2004 08:12 PM
DDJ Apr 27 2004 11:03PM GMT

Brief: Oracle buys information search
and retrieval software
06/17/2005 04:29 PM
Computerworld Jun 17 2005 8:15PM GMT

Web-Based Information Retrieval Support
Systems (WIRSS)
01/05/2004 05:40 PM
Web-Based Information Retrieval Support Systems (WIRSS): Building Research Tools for Scientists in the New Information Age
http://www2.cs.uregina.ca/~jtyao/Papers/Web_WI03.pdf

Abstract:

The concept of Web-based Information Retrieval Support Systems (WIRSS) is introduced. The need for WIRSS is shown by a detailed case study of existing research article indexing and citation analysis systems, such as Current Contents, DBLP, Science Citation Index and CiteSeer. The objective of WIRSS is to build new and effective research tools for scientists to access, explore and use information on the Web, which may lead to improved research productivity and quality.

The document triangle: The
interdependence of the structure,
information and presentation dimensions
11/06/2003 04:06 AM

Business Models for Information Design
and Development Departments
08/05/2002 11:45 PM

Cognitive Models for Web Design:
Information Foraging Theory Applied
12/11/2002 08:09 AM

Leap SE 2.0 turns System Requirements
into Object Models and Data Models,
Automatically
08/16/2004 02:37 AM
Leap SE—a CASE tool that generates object models directly from system requirements—now generates a data model as well, dramatically shortening the systems analysis phase of software development projects. [PRWEB Aug 16, 2004]

Print 3D models to cut-and-glue paper
models
12/19/2004 03:34 PM
Cory Doctorow: Jason sez, "Knowing how crazy Boing Boing readers are for origami and paper models, I thought you might be interested in Pepakura Designer, which lets you print out plans for paper models from objects designed in common 3D modelers... The demo has the save feature disabled, but you can still print your objects. It works with objects from 3D Studio, Lightwave, AutoCAD and a few others." Link (Thanks, Jason!)

Bush's War on Habeas Corpus 03/08/2004 11:18 PM
Put simply, we are being told that the president, solely on his say-so (or, in real life, the say-so of anyone he designates), can imprison anyone for any period of time -- and without any right to a lawyer or judicial review.

Uplug corpus tools 06/17/2004 06:42 AM
Initial release

LDAPUserGroups by makina-corpus on
2002/07/01
07/01/2002 10:33 AM

Corpus of Electronic Texts (CELT) 04/11/2004 06:42 AM
Corpus of Electronic Texts (CELT)
http://www.ucc.ie/celt/index.html

Developed at University College Cork, the Corpus of Electronic Texts project is intended "to bring the wealth of Irish literary and historical culture (in Irish, Latin, Anglo-Norman French, and English) to the Internet in a rigorously scholarly project." Additionally, the project is designed to be utilized by a wide group of interested parties, including students, academics, and the general public. Visitors may peruse the documents by language of original publication, or by viewing a complete list of all the works currently available (many in HTML or pdf format) from the project's website. Some of the rather compelling works available here include the complete works of Oscar Wilde, the political writings of Michael Collins, and various historical documents regarding the struggle for Irish independence. [From The Scout Report, Copyright Internet Scout Project 1994-2003. http://scout.wisc.edu/]

Hyderabad IT Venture mulls raising its
corpus
04/11/2005 03:51 AM
ZDNet India Apr 11 2005 7:33AM GMT

Local Leaders to Meet In Corpus Christi,
Texas for Wi-Fi Summit
08/05/2004 04:17 PM
Wi-Fi Technology Forum Aug 5 2004 8:30PM GMT

LoadPod brings local iPod loading
service to Corpus Christi, Texas
09/23/2004 02:44 AM
LoadPod, the first and only nationwide local iPod loading service, announced today that it has begun offering local service in Corpus Christi, Texas, and the entire 361 area code. [PRWEB Sep 23, 2004]

"Ray Bradbury, author of "A Bad Thing Is
Heading In This Direction" and
"Vocalizing the Electro-corpus" is
offended with Michael Moore for calling
his movie "Fahrenheit 9/11""
06/19/2004 04:26 PM

Notes and Tips: StatWorks Retrieval? 05/02/2004 11:08 AM
Anyone know how to retrieve some important old data in StatWorks format?

Perdition Mail Retrieval Proxy 1.12 12/15/2003 05:59 AM
A POP3 and IMAP4 proxy.

Perdition Mail Retrieval Proxy 1.15 05/27/2004 09:08 AM
A POP3 and IMAP4 proxy.

Distributed Generic Info Retrieval 05/08/2004 03:39 AM
New Portal Release (0.95)

Music and Audio Retrieval Tools 12/11/2003 07:23 PM
MaART version 20031211 released

RECOIN - Retrieval Component Integrator 08/16/2004 06:20 PM
RECOIN 0.2.9.2 released

LeanOnMe P2P backup/retrieval system
announced
08/19/2004 08:08 AM
312, Inc. has announced the release of LeanOnMe, a platform independent backup and remote access solution...

Native XML databases resolve XML
document retrieval issues
03/11/2003 01:22 AM
CNET Mar 10 2003 1:23AM ET

dtSearch Licenses Text Retrieval Engine
to NewHeights Software
05/10/2004 11:09 PM
BC Technology May 11 2004 3:36AM GMT

Sr Perl/mod_Perl Software Engineer for
search/retrieval industry
05/23/2004 10:27 PM
CNET Search.com - United States, CA, San Francisco (2004-05-23)

Perl/mod_Perl Software Engineer for
search/retrieval industry
05/23/2004 10:27 PM
CNET Search.com - United States, CA, San Francisco (2004-05-23)

Inbox Robot – Business & Competitive
Intelligence News Retrieval System
11/14/2003 07:35 PM
Inbox Robot – Business & Competitive Intelligence News Retrieval System
http://www.inboxrobot.com/

The Inbox Robot is a news retrieval system that allows you to search thousands of news headlines and/or receive customized newsletters directly in your email inbox. You can choose any topic and you will always get fresh news. If you would like to know more or have any questions, visit their help section.

MDKSA-2004:051 - Updated mailman
packages fix password retrieval
vulnerability
05/27/2004 03:22 PM
Mandrake Linux Security Team (May 26 2004)

GlueTheos: Automating the Retrieval and
Analysis of Data from Publicly Available
Software Repositories
09/22/2004 06:37 AM
GlueTheos: Automating the Retrieval and Analysis of Data from Publicly Available Software Repositories by Gregorio Robles, Jesús M. González-Barahona and Rishab A. Ghosh
http://opensource.mit.edu/papers/robles-barahona-ghosh_gluetheos.pdf

Abstract by Authors:
For efficient, large-scale data mining of publicly available information about libre (free, open source) software projects, automating the retrieval and analysis processes is a must. A system implementing such automation must take into account the many kinds of repositories with interesting information (each with its own structure and access methods), and the many kinds of analysis which can be applied to the retrieved data. In addition, such a system should be capable of interfacing with and reusing as much existing software for both retrieving and analyzing data as possible. As a proof of concept of how such a system could work, we started some time ago to implement the GlueTheos system, featuring a modular, flexible architecture which has already been used in several of our studies of libre software projects. In this paper we show its structure, how it can be used, and how it can be extended.

Structure 04/10/2004 08:25 PM
Structure 0.0.5 released

TransTec Solutions Releases Its Medical
Transcription Management and Document
Retrieval Software
06/23/2004 03:04 AM
After nearly three years of development and production trials, TransTec Solutions, today, announced the release of its transcription management and document retrieval software product, TranMan, designed for medical transcription businesses, hospitals, and group medical practices. “Having processed 20 million lines of transcription successfully, we’re definitely ready to go with the product,” says Dan Gaskin, founder of TransTec Solutions. [PRWEB Jun 23, 2004]

New PayPal Fee Structure 07/07/2004 07:33 PM
Business Knowledge Source Jul 7 2004 10:58PM GMT

Grok Description matches for Corpus Structure, Language Models, and Ad Hoc Information Retrieval

GrokA matches for Corpus Structure, Language Models, and Ad Hoc Information Retrieval

GMDH - Group Method of Data Handling 09/04/2004 06:23 AM
GMDH - Group Method of Data Handling
http://come.to/GMDH

Group Method of Data Handling (GMDH) has been applied in a wide variety of areas: data mining and knowledge discovery, forecasting and systems modeling, optimization, and pattern recognition. Inductive GMDH algorithms make it possible to find interrelations in data automatically, to select the optimal structure of a model or network, and to increase the accuracy of existing algorithms. This self-organizing approach differs substantially from the deductive methods commonly used for modeling. It is inductive in nature: it finds the best solution by sorting through possible variants. By sorting through different solutions, GMDH networks aim to minimize the influence of the author on the modeling results. The computer itself finds the structure of the model and the laws that act in the system. GMDH is a set of several algorithms for solving different problems. It consists of parametric, clusterization, analogues complexing, rebinarization, and probability algorithms. This has been added to Data Mining Resources Subject Tracer™ Information Blog and Knowledge Discovery Subject Tracer™ Information Blog.

mcl-algorithm 04-105 05/10/2004 10:09 AM
A cluster algorithm for graphs.

mcl-algorithm 04-230 08/17/2004 07:15 PM
A cluster algorithm for graphs.

Algorithm-SVM-0.11 04/12/2005 08:30 PM

Algorithm-FEC-0.5 06/21/2004 10:43 AM

Algorithm-SVM-0.08 06/09/2004 05:44 PM

mcl-algorithm 04-189 07/07/2004 04:34 PM
A cluster algorithm for graphs.

"Code Access Security (CAS) ? "Guilty
until proven Innocent" (Partially
Trusted Code) "
06/22/2004 04:03 AM

Code Snippets: Store, sort and share
source code, with tag goodness
04/08/2005 07:52 PM
Code Snippets: Store, sort and share source code, with tag goodness

bigbold.com/snippets


"Code Snippets: Store, sort and share
source code, with tag goodness"
04/09/2005 09:08 AM

BlockRanking algorithm 02/13/2004 09:07 AM
The project is open

Algorithm-CheckDigits-0.36 07/13/2004 11:40 PM

Algorithm-CheckDigits-0.37 07/14/2004 05:11 PM

Algorithm-Networksort-1.04 12/14/2003 06:21 PM

Algorithm-Dependency-1.0 07/18/2004 05:18 PM

Algorithm-Merge-0.04 10/29/2003 09:13 AM

Algorithm-SetCovering-0.05 12/05/2003 05:37 AM

Algorithm-Merge-0.05 11/06/2003 06:17 PM

Algorithm-CheckDigits-0.33 06/09/2004 05:44 PM

Algorithm-Cluster-1.27 06/09/2004 05:44 PM

Algorithm-SkipList-1.02 01/05/2005 01:55 AM

Algorithm-CheckDigits-0.32 05/14/2004 04:41 PM

Algorithm-CheckDigits-0.34 06/14/2004 05:52 PM

Yet Another C Chatting Algorithm 12/26/2004 01:16 AM
New AI

Algorithm-Dependency-0.5 11/18/2003 06:40 PM

Algorithm-Dependency-0.6 11/19/2003 08:10 AM

Algorithm-Interval2Prefix-0.02 12/02/2003 11:27 PM

Algorithm-GDiffDelta-0.01 09/19/2004 12:17 AM

Algorithm-MedianSelect-0.01 01/22/2004 10:18 AM

Algorithm-QuadTree-0.1 08/09/2004 05:12 PM

The following phrases have been identified by the grok system as matching this entry: gmdh algorithm code
















