Dominant vectors of nonnegative matrices : application to information extraction in large graphs
Objects such as documents, people, words or utilities that are related in some way, for instance by citations, friendship, appearance in definitions or physical connections, can be conveniently represented as graphs or networks. An increasing number of such relational databases, as for instance...
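Main Author: | Ninove, Laure |
---|---|
Format: | Others |
Language: | en |
Published: | Universite catholique de Louvain 2008 |
Subjects: | PageRank; Information extraction; Networks; Graphs; Cones; Nonlinear iterations; Perron-Frobenius; Eigenvalue problems; Dominant vectors; Nonnegative matrices |
Online Access: | http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-02072008-145307/ |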
id |
ndltd-BICfB-oai-ucl.ac.be-ETDUCL-BelnUcetd-02072008-145307 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-BICfB-oai-ucl.ac.be-ETDUCL-BelnUcetd-02072008-145307 2013-01-07T15:42:05Z Dominant vectors of nonnegative matrices : application to information extraction in large graphs Ninove, Laure 2008-02-21 text application/pdf en unrestricted
I agree that the text of the thesis (hereafter "the work"), subject to the parts covered by confidentiality, be published in the UCL electronic theses collection. To this end, I grant UCL a license covering: - the right to fix and reproduce the work on electronic media: ETD/db software - the right to communicate the work to the public. This license, free of charge and non-exclusive, is valid for the entire duration of the literary and artistic property rights, including any extensions, and worldwide. I retain all other rights to reproduce and communicate the thesis, as well as the right to use it in future work. I certify that I have obtained, in accordance with copyright legislation and the requirements of image rights, all authorizations necessary to reproduce in my thesis images, texts, and/or any work protected by copyright, and that I have obtained the authorizations necessary to communicate them to third parties. Where a third party holds an intellectual property right over all or part of my thesis, I certify that I have obtained that party's written authorization to exercise the rights mentioned above. |
collection |
NDLTD |
language |
en |
format |
Others |
sources |
NDLTD |
topic |
PageRank; Information extraction; Networks; Graphs; Cones; Nonlinear iterations; Perron-Frobenius; Eigenvalue problems; Dominant vectors; Nonnegative matrices |
description |
Objects such as documents, people, words or utilities that are related in some way, for instance by citations, friendship, appearance in definitions or physical connections, can be conveniently represented as graphs or networks. An increasing number of such relational databases are available, such as the World Wide Web, digital libraries, social networking web sites and phone call logs. Relevant information may be hidden in these networks. A user may, for instance, need to find authoritative web pages on a particular topic, retrieve a list of similar documents from a digital library, or identify communities of friends from a social networking site or a phone call log. Unfortunately, extracting this information may not be easy.
This thesis is devoted to the study of problems related to information extraction in large graphs with the help of dominant vectors of nonnegative matrices. The graph structure is very useful for retrieving information from a relational database, and the correspondence between nonnegative matrices and graphs makes Perron-Frobenius methods a powerful tool for the analysis of networks.
In the first part, we analyze the fixed points of a normalized affine iteration used by a database matching algorithm. We then consider questions related to PageRank, a method for ranking web pages that is based on a random surfer model and used by the well-known web search engine Google. In the second part, we study optimal linkage strategies for a webmaster who wants to maximize the average PageRank score of a web site. Finally, the third part is devoted to the study of a nonlinear variant of PageRank. The simple model that we propose takes into account the mutual influence between web ranking and web surfing. |
author |
Ninove, Laure |
title |
Dominant vectors of nonnegative matrices : application to information extraction in large graphs |
publisher |
Universite catholique de Louvain |
publishDate |
2008 |
url |
http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-02072008-145307/ |