Vectorial Model
Also known as the vector space model. In this model, each document is represented by a vector that captures the relationship between the document and its features.
The features that make up the vector are derived from the occurrences of certain significant words in the text.
This vector representation is then used when queries are issued to retrieve information. Retrieval works by representing the query as a vector too and comparing it with the document vectors using a similarity measure. The degree of similarity varies with the query being made. The higher the degree of similarity, the better the document is considered to fit the query.
With this model, documents can be returned in ranked order, and the number of results can be limited by requiring a minimum degree of similarity.
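The process described above can be illustrated with a minimal sketch. It assumes raw word counts as features and cosine similarity as the similarity measure; the text does not name a specific weighting or measure, so both are assumptions, and the function names (to_vector, cosine_similarity, search) are hypothetical.

```python
import math
from collections import Counter

def to_vector(text):
    """Term-frequency vector: maps each word to its number of occurrences."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def search(query, documents, min_similarity=0.0):
    """Return (score, document) pairs with score >= min_similarity, best first."""
    q = to_vector(query)
    scored = [(cosine_similarity(q, to_vector(d)), d) for d in documents]
    kept = [(s, d) for s, d in scored if s >= min_similarity]
    return sorted(kept, key=lambda pair: pair[0], reverse=True)

# Illustrative documents (assumed, not taken from the course material).
DOCS = [
    "information retrieval models",
    "boolean algebra and set theory",
    "retrieval of information from text documents",
]

results = search("information retrieval", DOCS, min_similarity=0.1)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Raising min_similarity shortens the result list, which is the limiting effect described above. In practice the raw counts are often replaced by weighted values such as TF-IDF, but that goes beyond what the text states.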
Boolean Model
This is one of the simplest known models for information retrieval. It is based on Boolean algebra and set theory. Whereas the vector model turns queries into a vector of features, this model builds a Boolean expression to formalise the query. This expression uses the Boolean operators AND, OR and NOT.
When retrieving information, a document is considered relevant or not depending on whether a word is present in it, that is:
- If the word is present: the document contains it.
- If both words are present: word1 AND word2.
- If one word is present and the other is not: word1 AND NOT word2.
- If either one or the other is present: word1 OR word2.
These combinations vary depending on the number of keywords.
Depending on which Boolean operators join the keywords, different documents are retrieved: searching for word1 AND word2 (both must appear) is not the same as searching for word1 OR word2 (either one may appear).
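Because the model rests on set theory, the three operators map directly onto set intersection, difference and union. The following is a minimal sketch under that reading; the sample documents and the helper names (build_index, docs_with) are hypothetical.

```python
def build_index(documents):
    """Map each word to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in enumerate(documents):
        for word in set(text.lower().split()):
            index.setdefault(word, set()).add(doc_id)
    return index

# Illustrative documents (assumed, not taken from the course material).
DOCS = [
    "vector model with weights",       # id 0
    "boolean model with operators",    # id 1
    "boolean algebra and set theory",  # id 2
]
index = build_index(DOCS)
all_ids = set(range(len(DOCS)))

def docs_with(word):
    """Ids of the documents containing the word (empty set if none)."""
    return index.get(word, set())

# boolean AND model: both words must appear.
both = docs_with("boolean") & docs_with("model")
# boolean AND NOT model: the first word appears, the second does not.
only_first = docs_with("boolean") & (all_ids - docs_with("model"))
# boolean OR model: at least one of the words appears.
either = docs_with("boolean") | docs_with("model")

print(sorted(both), sorted(only_first), sorted(either))
```

Each result is a plain set of document ids: a document either matches or it does not, and nothing in the result says which match is more relevant.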
The problem with this model is that, given a set of retrieved documents, there is no way to rank them, because the relevance of each one is unknown. To fix this, the extended Boolean model can be used: it adds weights to the search terms, which brings it closer to the vector model.
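One well-known formulation of the extended Boolean model is the p-norm model, in which each term has a weight between 0 and 1 in each document and AND/OR queries produce a score instead of a yes/no answer. The sketch below assumes that formulation with two query terms; the text does not say which variant is meant, and the weights and function names are hypothetical.

```python
def or_score(w1, w2, p=2.0):
    """p-norm score for 'term1 OR term2'; w1 and w2 are term weights in [0, 1]."""
    return ((w1 ** p + w2 ** p) / 2) ** (1 / p)

def and_score(w1, w2, p=2.0):
    """p-norm score for 'term1 AND term2'; w1 and w2 are term weights in [0, 1]."""
    return 1 - (((1 - w1) ** p + (1 - w2) ** p) / 2) ** (1 / p)

# Hypothetical weights of (word1, word2) in three documents.
weights = {
    "doc A": (0.9, 0.8),
    "doc B": (1.0, 0.1),
    "doc C": (0.2, 0.3),
}

# Rank the documents for the query "word1 AND word2", best first.
ranking = sorted(weights, key=lambda d: and_score(*weights[d]), reverse=True)
for name in ranking:
    print(name, round(and_score(*weights[name]), 3))
```

A document that contains both terms strongly scores highest, while one that contains only one of them still receives a partial score rather than being discarded, which is what makes ranking possible.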
This page was developed for a Computer Engineering course at Carlos III University of Madrid, specifically Information Retrieval and Access.
Topics covered:
Unsupervised Information Extraction and Retrieval
Usability and accessibility in positioning and information retrieval
Also of interest:
Retrieval engines for XML/RDF documents
Retrieval and organization of information
Language processing for information retrieval
Metadata and XML/RDF documents for retrieval
Information extraction with supervised classification
Organizing information with unsupervised classification