Signal or Noise? WWW 2013
The 3rd Joint WICOW/AIRWeb Workshop on Web Quality
Rio de Janeiro, Brazil. May 13, 2013

News

WebQuality 2014 will be held at WWW 2014.

Objectives

The Web and social media are growing in both size and complexity, and are playing an increasing role in our lives. Finding relevant, timely and trustworthy content in a sea of seemingly irrelevant chatter remains a challenging research issue. On the one hand, this workshop deals with the more blatant and malicious attempts to degrade web quality, such as spam, plagiarism, or various forms of abuse, and with ways to prevent them or neutralize their impact on users' experience. On the other hand, it also provides a venue for exchanging ideas on quantifying and modeling issues of content quality, credibility and author reputation.

The objective of the workshop is to provide the research communities working on web quality topics with a survey of current problems and potential solutions. It presents an opportunity for close interaction between researchers and practitioners who may be focused on isolated sub-areas. We also want to gather crucial feedback for the academic community from participants representing major industry players on how web content quality research can contribute to practice.

Accepted Papers

  • "Defending Imitating Attacks in Web Credibility Evaluation Systems" (full paper) [slides] [paper]
    Xin Liu, Radoslaw Nielek, Adam Wierzbicki and Karl Aberer
  • "Cross-Lingual Web Spam Classification" (full paper) [paper]
    Andras Garzo, Balint Daroczy, Tamas Kiss, David Siklosi and Andras A. Benczur
  • "Russian web spam evolution: Yandex experience" (short paper) [slides] [paper]
    Sergey Pevtsov and Sergey Volkov
  • "Graph-based Malware Distributors Detection" (short paper) [slides] [paper]
    Andrei Venzhega, Polina Zhinalieva and Nikolay Suboch
  • "Quality-biased Ranking for Queries with Commercial Intent" (short paper) [slides] [paper]
    Alexander Shishkin, Polina Zhinalieva and Kirill Nikolaev
  • "On the Subjectivity and Bias of Web Content Credibility Evaluations" (full paper) [slides] [paper]
    Michal Kakol, Michal Jankowski-Lorek, Katarzyna Abramczuk, Adam Wierzbicki and Michele Catasta
  • "Trustworthiness criteria for supporting users to assess the credibility of Web Information" (full paper) [slides] [paper]
    Jarutas Pattanaphanchai, Kieron O'Hara and Wendy Hall
  • "Automatically Generated Spam Detection Based on Sentence-level Topic Information" (short paper) [paper]
    Yoshihiko Suhara, Hiroyuki Toda, Shuichi Nishioka and Seiji Susaki

Themes and Topics

The main themes of the workshop are evaluating the credibility of web information, and identifying and combating qualitatively extreme content (and related behavior), such as spam. These themes encompass a large set of often-related topics and subtopics, as listed below.

Assessing the credibility of content and people on the web and social media

    Uncovering distorted and biased content
  • Detecting disagreement and conflicting opinions
  • Detecting disputed or controversial claims
  • Uncovering distorted or biased, inaccurate or false information
  • Uncovering common misconceptions and false beliefs
  • Search models and applications for finding factually correct information on the Web
  • Comparing authorized vs. unauthorized information (e.g. news article vs. readers' comments)
  • Comparing and evaluating online reviews, product or service testimonials
    Measuring quality of web content
  • Information quality and credibility of web search results, on social media sites, of online mass-media and news, and on the Web in general
  • Estimation of information age, provenance, validity, coverage, and completeness or depth
  • Formation, change, and evolution of opinions
  • Sociological and psychological aspects of information credibility estimation
  • User studies of information credibility evaluation
    Modeling author identity, trust, and reputation
  • Estimating authors' and publishers' reputation
  • Evaluating authors' qualifications and credentials
  • Transparent ranking/reputation systems
  • Author intent detection
  • Capturing personal traits and sentiment
  • Modeling author identity, authorship attribution, and writing style
  • Systems for managing author identity on the Web
  • Revealing hidden associations between authors, commenters, reviewers, etc.
    Role of groups and communities
  • Role of groups, communities, and invisible colleges in the formation of opinions on the Web
  • Social-network-based credibility evaluation
  • Analysis of information dissemination on the Web
  • Common cognitive or social biases in user behavior (e.g., herd behavior)
  • Credibility in collaborative environments (e.g., on Wikipedia)
    Multimedia content credibility
  • Detecting deceptive manipulation or distortion of images and multimedia
  • Hiding content in images
  • Detecting incorrect labels or captions of images on the Web
  • Detecting mismatches between online images and the represented real objects
  • Credibility of online maps

Fighting spam, abuse, and plagiarism on the Web and social media

    Reducing web spam
  • Detecting various types of search engine spam (e.g., link spam, content spam, or cloaking)
  • Uncovering social network spam (e.g., serial sharing and lobbying) and spam in online media (e.g., blog, forum, wiki spam, or tag spam)
  • Identifying review and rating spam
  • Characterizing trends in spamming techniques
    Reducing abuses of electronic messaging systems
  • Detecting e-mail spam
  • Detecting spit (spam over internet telephony) and spim (spam over instant messenger)
    Detecting abuses in internet advertising
  • Click fraud detection
  • Measuring information credibility in online advertising and monetization
    Uncovering plagiarism and multiple-identity issues
  • Detecting plagiarism in general, and in web communities, social networks, and cross-language environments in particular
  • Identifying near-duplicate and versioned content of all kinds (e.g., text, software, image, music, or video)
  • High-similarity retrieval technologies (e.g., fingerprinting and similarity hashing)
    Promoting cooperative behavior in social networks
  • Monitoring vandalism, trolling, and stalking
  • Detecting fake friendship requests with spam intentions
  • Creating incentives for good behavior in social networks
  • User studies of misuse of the Web
    Security issues with online communication
  • Detecting phishing and identity theft
  • Flagging malware (e.g., viruses and spyware)
  • Web forensics

Other adversarial issues

  • Modeling and anticipating responses of adversaries to counter-measures
  • New web infringements
  • Web content filtering
  • Bypassing censorship on the Web
  • Blocking online advertisements
  • Reverse engineering of ranking algorithms
  • Stealth crawling

Program


[13:00 - 14:30] Web Content Quality Session:

  • "Defending Imitating Attacks in Web Credibility Evaluation Systems"
    Xin Liu, Radoslaw Nielek, Adam Wierzbicki and Karl Aberer
  • "Trustworthiness Criteria for Supporting Users to Assess the Credibility of Web Information"
    Jarutas Pattanaphanchai, Kieron O'Hara and Wendy Hall
  • "On the Subjectivity and Bias of Web Content Credibility Evaluations"
    Michal Kakol, Michal Jankowski-Lorek, Katarzyna Abramczuk, Adam Wierzbicki and Michele Catasta

[14:30 - 15:00] ** Coffee Break **

[15:00 - 16:30] Invited Talk:

[16:30 - 17:00] ** Coffee Break **

[17:00 - 18:30] Industry Experience Session:

  • "Russian Web Spam Evolution: Yandex Experience"
    Sergey Pevtsov and Sergey Volkov
  • "Graph-based Malware Distributors Detection"
    Andrei Venzhega, Polina Zhinalieva and Nikolay Suboch
  • "Quality-biased Ranking for Queries with Commercial Intent"
    Alexander Shishkin, Polina Zhinalieva and Kirill Nikolaev

[18:30 - 20:00] Web Spam Detection Session:

  • "Cross-Lingual Web Spam Classification"
    Andras Garzo, Balint Daroczy, Tamas Kiss, David Siklosi and Andras A. Benczur
  • "Automatically Generated Spam Detection Based on Sentence-level Topic Information"
    Yoshihiko Suhara, Hiroyuki Toda, Shuichi Nishioka and Seiji Susaki

Invited Talk

Title: "Web Search and Web Quality"

Speaker: Ricardo Baeza-Yates

Short bio: Ricardo Baeza-Yates is VP of Research for Europe and Latin America, leading the Yahoo! Research labs in Barcelona, Spain and Santiago, Chile, and also supervising the lab in Haifa, Israel. Until 2005 he was the director of the Center for Web Research at the Department of Computer Science of the Engineering School of the University of Chile, and ICREA Professor and founder of the Web Research Group at the Dept. of Information and Communication Technologies of Universitat Pompeu Fabra in Barcelona, Spain. His research interests include algorithms and data structures, information retrieval, web data mining, and data visualization. He is an ACM Fellow and an IEEE Fellow.

Organizers

Adam Jatowt (Kyoto University)
Carlos Castillo (Qatar Computing Research Institute)
Zoltan Gyongyi (Google Research)
Katsumi Tanaka (Kyoto University)

PC Members:
Ching-man Au Yeung (Huawei Noah's Ark Lab)
Andras Benczur (Hungarian Academy of Sciences)
James Caverlee (Texas A&M University)
Kumar Chellapilla (Twitter)
Brian Davison (Lehigh University)
Dennis Fetterly (Microsoft)
Andrew Flanagin (University of California, Santa Barbara)
Pranam Kolari (Walmart Labs)
Panagiotis Metaxas (Wellesley College)
Miriam Metzger (University of California, Santa Barbara)
Meenali Rungta (Google)
Shazia Sadiq (University of Queensland)
Masashi Toyoda (University of Tokyo)
Steve Webb (Georgia Institute of Technology)
Baoning Wu (OpenX)

Contact

Email: adam [at] dl [dot] kuis [dot] kyoto-u [dot] ac [dot] jp
Phone: +81-75-753-5909