Dominant vectors of nonnegative matrices : application to information extraction in large graphs

Objects such as documents, people, words or utilities, that are related in some way, for instance by citations, friendship, appearance in definitions or physical connections, may be conveniently represented using graphs or networks. An increasing number of such relational databases, as for instance...


Bibliographic Details
Main Author: Ninove, Laure
Format: Others
Language: en
Published: Universite catholique de Louvain 2008
Subjects:
Online Access: http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-02072008-145307/
id ndltd-BICfB-oai-ucl.ac.be-ETDUCL-BelnUcetd-02072008-145307
record_format oai_dc
spelling ndltd-BICfB-oai-ucl.ac.be-ETDUCL-BelnUcetd-02072008-145307 2013-01-07T15:42:05Z Dominant vectors of nonnegative matrices : application to information extraction in large graphs Ninove, Laure Universite catholique de Louvain 2008-02-21 text application/pdf http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-02072008-145307/ en unrestricted
I agree that the text of the thesis (hereafter the work), subject to the parts covered by confidentiality, be published in the UCL electronic theses collection. To this end, I grant UCL a license covering: - the right to fix and reproduce the work on an electronic medium: ETD/db software - the right to communicate the work to the public. This license, free of charge and non-exclusive, is valid for the entire duration of the literary and artistic property rights, including any extensions, and for the whole world. I retain all other rights for the reproduction and communication of the thesis, as well as the right to use it in future work. I certify that I have obtained, in accordance with copyright legislation and the requirements of image rights, all the authorizations necessary for the reproduction in my thesis of images, texts, and/or any work protected by copyright, and that I have obtained the authorizations necessary for their communication to third parties. Where a third party holds an intellectual property right over all or part of my thesis, I certify that I have obtained their written authorization for the exercise of the rights mentioned above.
collection NDLTD
language en
format Others
sources NDLTD
topic PageRank
Information extraction
Networks
Graphs
Cones
Nonlinear iterations
Perron-Frobenius
Eigenvalue problems
Dominant vectors
Nonnegative matrices
description Objects such as documents, people, words or utilities that are related in some way, for instance by citations, friendship, appearance in definitions or physical connections, may be conveniently represented using graphs or networks. An increasing number of such relational databases are available, for instance the World Wide Web, digital libraries, social networking web sites and phone call logs. Relevant information may be hidden in these networks. A user may, for instance, need to find authority web pages on a particular topic or a list of similar documents in a digital library, or to determine communities of friends from a social networking site or a phone call log. Unfortunately, extracting this information may not be easy. This thesis is devoted to the study of problems related to information extraction in large graphs with the help of dominant vectors of nonnegative matrices. The graph structure is indeed very useful for retrieving information from a relational database. The correspondence between nonnegative matrices and graphs makes Perron-Frobenius methods a powerful tool for the analysis of networks. In the first part, we analyze the fixed points of a normalized affine iteration used by a database matching algorithm. We then consider questions related to PageRank, a method for ranking web pages that is based on a random surfer model and used by the well-known web search engine Google. In the second part, we study optimal linkage strategies for a webmaster who wants to maximize the average PageRank score of a web site. Finally, the third part is devoted to the study of a nonlinear variant of PageRank. The simple model that we propose takes into account the mutual influence between web ranking and web surfing.
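As an illustration of the random surfer model mentioned in the description, the following is a minimal sketch of the standard PageRank power iteration, not the thesis's own code. The damping factor 0.85, the uniform teleportation, the function name `pagerank` and the four-node example graph are all assumptions made for this example; the record does not specify them.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for the random-surfer PageRank vector.

    links: dict mapping each node to the list of nodes it links to.
    damping: probability of following a link rather than jumping to a
             uniformly random node (0.85 is an assumed, common choice).
    """
    nodes = sorted(set(links) | {v for targets in links.values() for v in targets})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        # Rank held by dangling nodes (no out-links) is spread uniformly.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new = {u: (1.0 - damping) / n + damping * dangling / n for u in nodes}
        for u in nodes:
            targets = links.get(u, [])
            for v in targets:
                # Each out-link of u carries an equal share of u's rank.
                new[v] += damping * rank[u] / len(targets)
        delta = sum(abs(new[u] - rank[u]) for u in nodes)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: "c" is linked to by every other node.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(graph)
```

The resulting vector is the dominant (Perron) vector of the nonnegative "Google matrix" of the graph, normalized to sum to 1. In this toy graph the node with no incoming links receives only the teleportation share, while the most linked-to node ranks highest.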
author Ninove, Laure
title Dominant vectors of nonnegative matrices : application to information extraction in large graphs
publisher Universite catholique de Louvain
publishDate 2008
url http://edoc.bib.ucl.ac.be:81/ETD-db/collection/available/BelnUcetd-02072008-145307/