Scholar H-Index Calculator - Home page
About Scholar H-Index Calculator
Scholar H-Index Calculator (the Calculator from now on) is an add-on for Firefox that enhances Google Scholar result pages by showing a number of bibliometric indices computed from the data displayed on screen.
Once installed, the Calculator works transparently when you query Google Scholar: as soon as you run a query, result pages are enriched with useful data (e.g. the h-index computed from the displayed results), and new functions become available.
The Team
Project Coordinator
- Giovambattista Ianni (Code design, writing, reviewing, maintenance and refactoring) 
Developers Team
- Francesco Cauteruccio (Aggressive refining engine), Susanna Cozza (General code maintenance), Stefano Germano (Custom formulas parser), Maria Carmela Santoro (General code refactoring, Additional results browsing code). 
You can contact us at shi_AT_mat.unical.it (replace _AT_ with a '@' to obtain our mail address).
Download
- Beta 3.0 (Coming soon).
- Official Scholar H-Index Calculator page at Mozilla. 
Documentation and examples
Author lists refinement
WORK IN PROGRESS. This function lets you (semi-)automatically compute accurate normalized indices, overcoming the underestimate (4 authors) that affects multi-authored papers with 4 or more co-authors. If Scholar Preferences are set to display BibTeX data URLs, the advanced interface shows a new control named Refine this author list for each paper. For a given paper P, clicking its Refine this author list button fills the entry for P with its full list of authors and displays the full name of the journal or conference of P (if the data is available). Indices are updated automatically.
A button named Refine all bibliographic entries is also available; it automatically performs the above refinement for each displayed paper. Be warned that refining all papers at once is an experimental feature and might make Scholar detect you as automated software and ask for a captcha.
Custom formula editing
As of Calculator 3.0, users can add their own bibliometric formulas and display the results next to the default indices. There are two types of custom formulas: Normalizations and Indices.
Normalizations
In the Calculator information box, each row shows bibliometric indices computed under a given Normalization. Each normalization weighs the citations of each paper according to a given criterion. There are three default normalizations:
- 'none' : no normalization. The normalized citations of a paper correspond to those displayed (after subtracting self citations). Same as the custom formula citations-selfCitations. 
- 'by authors': the citations of each paper are normalized by the (estimated) number of authors. Same as the custom formula (citations-selfCitations)/authors. For instance, a paper with 100 citations and 4 authors will score 25 normalized citations. The number of authors cannot always be estimated correctly unless the refinement function is used. You might want to read the Author Refinement section to see how the Calculator estimates the number of authors of each paper. 
- 'by age': if paper i has been cited t times and was written in 2001, its number of citations normalized by age is t/(CY-2001+1), where CY is the current year. Same as the custom formula (citations-selfCitations)/(thisYear-year+1). As an example, a paper with 100 citations written in 2003 would score 10 normalized citations in 2012. 
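The three default normalizations can be cross-checked with a minimal Python sketch. This is not the Calculator's code: the `Paper` class, its field names and the function names are hypothetical and simply mirror the custom formulas quoted above.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    """Hypothetical container for the per-paper attributes used by the formulas."""
    citations: int
    self_citations: int
    authors: int
    year: int


def norm_none(p: Paper) -> float:
    # 'none': citations - selfCitations
    return p.citations - p.self_citations


def norm_by_authors(p: Paper) -> float:
    # 'by authors': (citations - selfCitations) / authors
    return (p.citations - p.self_citations) / p.authors


def norm_by_age(p: Paper, this_year: int) -> float:
    # 'by age': (citations - selfCitations) / (thisYear - year + 1)
    return (p.citations - p.self_citations) / (this_year - p.year + 1)


paper = Paper(citations=100, self_citations=0, authors=4, year=2003)
print(norm_none(paper))          # 100
print(norm_by_authors(paper))    # 25.0
print(norm_by_age(paper, 2012))  # 10.0
```

The printed values match the worked examples in the list: 25 normalized citations for 4 authors, and 10 for a 2003 paper evaluated in 2012.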
You can add your own normalization formulas by clicking the 'New normalization' button at the bottom of the Information box. Two editable text fields will appear. Enter the normalization name in the leftmost field and your custom formula in the rightmost. Click anywhere else when ready, and if your formula is correct, you should see a new row in which all the available indices are computed according to your new normalization. Enjoy!
Custom Normalization Formulas Language
You should be aware that normalization formulas are applied on a per-paper basis: each formula works in the context of a single paper. For a paper i, a custom normalization formula f(i) returns a number of citations, depending on how f behaves. A normalization formula can access the following attributes of the paper i:
- citations : the number of citations for i (as it appears on screen). 
- year: the year of publication of i (as it appears on screen; conventionally set to '-100,000,000,000,000' if not present). 
- authors: the number of authors of i. This is estimated from what appears on screen, and can be manually edited by clicking on the Authors field or using the 'Auth+' and 'Auth-' buttons for the paper i. See the Author Refinement section. 
- selfCitations: the number of self citations of i, as they appear in the 'Self Citations' editable text field. Defaults to 0. 
- cleanCitations: a shortcut for (citations-selfCitations). 
- age : a shortcut for (thisYear-year+1). 
- thisYear : current year, as of your PC's wall clock. 
Allowed symbols:
- +, -, /, *, ^, (, ), with intuitive meaning (^ is exponentiation). The square root of x can be easily obtained as x^0.5. 
Some further examples:
- Carbone's normalization: citations/(authors^0.5). 
- hc-index(delta,gamma) : gamma*citations/age^delta (replace gamma and delta with your favourite values) 
- Combined age and author weighting: citations/age/authors 
Custom formulas are visible only when the Advanced interface is enabled.
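The example formulas above can be sanity-checked outside the Calculator with a short Python sketch. The function names are hypothetical, and the default values for gamma and delta are placeholders chosen only for illustration; in Python, `**` plays the role of `^`.

```python
def carbone(citations: float, authors: int) -> float:
    # citations/(authors^0.5)
    return citations / authors ** 0.5


def hc_weight(citations: float, age: int, gamma: float = 4.0, delta: float = 1.0) -> float:
    # gamma*citations/age^delta (gamma and delta here are illustrative placeholders)
    return gamma * citations / age ** delta


def age_and_authors(citations: float, age: int, authors: int) -> float:
    # citations/age/authors
    return citations / age / authors


# A hypothetical paper: 100 citations, 4 authors, age 10.
print(carbone(100, 4))              # 50.0
print(hc_weight(100, 10))           # 40.0
print(age_and_authors(100, 10, 4))  # 2.5
```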
Indices
Indices correspond to columns in the Calculator information box. Each column shows a bibliometric index computed on a given set of papers. Besides the default indices, you can add your own.
Custom index formulas Language
You should be aware that index formulas are applied to a sorted list of papers (usually the list of entries displayed on screen, sorted by the normalization at hand). For a sorted list of papers S, a custom index formula f(S) returns an index value, depending on how f behaves. An index formula can access the attributes of all the papers in the corpus. For each row in the information box, the corresponding normalization function is applied beforehand, and the displayed papers are first sorted by their number of normalized citations; that is, f is computed once for each available normalization. The language for custom index formulas is much richer than the normalization language. The available constructs are listed next.
In the following, assume that a sorted list of papers S and a normalization function n(i) are given, where i denotes the i-th paper of S.
Special arrays:
- citations[x] : the number of normalized citations for the x-th paper in S (i.e. n(x)) 
- year[x]: the year of publication of the x-th paper in S. 
- age[x]: a shortcut for (thisYear-year[x]+1) 
- authors[x]: the number of authors of the x-th paper in S. 
- selfCitations[x]: the number of self citations of the x-th paper in S (not normalized). 
- plainCitations[x]: the number of citations for the x-th paper in S, without any normalization applied. 
x can be any allowed formula.
Special symbols:
- N : the number of papers in S. 
- h, g, e, deltaH, deltaG : the value of the respective indices, obtained according to the citation normalization at hand. 
- thisYear : current year, as of your PC's wall clock. 
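For reference, the symbols h and g can be related to the sorted citation list with the following Python sketch. It follows the commonly used definitions of the h-index and g-index (with g capped at the number of papers); the function names are hypothetical and the Calculator's own implementation may differ in details.

```python
def h_index(sorted_citations):
    """Largest h such that the h-th paper (1-based) has at least h citations."""
    h = 0
    for rank, cites in enumerate(sorted_citations, start=1):
        if cites >= rank:
            h = rank
        else:
            break  # the list is sorted in descending order, so no later rank can qualify
    return h


def g_index(sorted_citations):
    """Largest g (capped at the list length) such that the top g papers total at least g^2 citations."""
    g = 0
    total = 0
    for rank, cites in enumerate(sorted_citations, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


citations = [10, 6, 4, 2, 1]  # already sorted by (normalized) citations, descending
print(h_index(citations))  # 3
print(g_index(citations))  # 4
```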
Functions:
Aggregate functions are available: these come in two possible forms:
funcName(start,end,variable,expression)
or
funcName(start,end,variable,booleanExpression,expression)
Here start and end are expressions denoting the bounds of the numeric range that variable sweeps over; variable is an identifier of your choice, which may appear in expression. A booleanExpression has the form expr relOp expr, where relOp is one of <, >, >=, <=, ==, !=.
The currently available aggregate functions are min, max, sum and prod. To see how aggregates work, assume a set of 5 papers with 10, 6, 4, 2 and 1 citations respectively. Then
    min(1,N,i,citations[i]) = 1
    max(1,N,i,citations[i]) = 10
    sum(1,N,i,citations[i]) = 23
Boolean expressions can be used to select which papers are taken into account by the aggregate function. For instance, the Google My Citations i10-index (the number of publications with at least 10 citations) is
    sum(1,N,i,citations[i] >= 10,1)
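The semantics of the two aggregate forms can be sketched in Python as follows. This is an illustrative model, not the Calculator's parser: the `aggregate` helper and its parameters are hypothetical, and it assumes that the range from start to end is inclusive and that papers are indexed from 1, as the worked example suggests.

```python
import math

# Map each aggregate name to a Python reducer.
REDUCERS = {"min": min, "max": max, "sum": sum, "prod": math.prod}


def aggregate(func_name, start, end, expression, condition=None):
    """Model of funcName(start,end,variable[,booleanExpression],expression).

    `expression` and `condition` are callables taking the loop variable.
    Note: min/max raise ValueError if no value passes the condition.
    """
    values = [
        expression(i)
        for i in range(start, end + 1)  # inclusive range (assumption)
        if condition is None or condition(i)
    ]
    return REDUCERS[func_name](values)


data = [10, 6, 4, 2, 1]
N = len(data)


def citations(i):
    # 1-based access, mirroring citations[i] in the formula language
    return data[i - 1]


print(aggregate("min", 1, N, citations))  # 1
print(aggregate("max", 1, N, citations))  # 10
print(aggregate("sum", 1, N, citations))  # 23
# i10-index: sum(1,N,i,citations[i] >= 10,1)
print(aggregate("sum", 1, N, lambda i: 1, lambda i: citations(i) >= 10))  # 1
```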
As in normalization formulas, allowed expressions include +, -, /, *, ^, (, ), with intuitive meaning.
Some examples:
- Equivalent impact of the Top-10 articles: sum(1,10,i,citations[i])^0.5 
- e-index : (sum(1,h,i,citations[i])-h^2)^0.5 
- AR-index: sum(1,h,i,citations[i])^0.5 (corresponds to the AR index when citations are normalized by age). 
- Citations in the last five years: sum(1,N,i,age[i] <= 5,citations[i]) 
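Two of the patterns used in these examples, the square root of the citations of the top-ranked papers and a sum filtered by age, can be reproduced with plain Python. The sample data, the fixed value of `this_year` and the function names are hypothetical; list slicing silently caps k at the number of papers, whereas the Calculator's behaviour for k greater than N is not stated here.

```python
import math

# Hypothetical sample: (normalized citations, publication year), sorted by citations.
papers = [(10, 2005), (6, 2010), (4, 2008), (2, 2011), (1, 2003)]
this_year = 2012  # fixed so that the results are reproducible

citations = [c for c, _ in papers]
age = [this_year - year + 1 for _, year in papers]


def root_of_top(k):
    # sum(1,k,i,citations[i])^0.5, e.g. with k = 10 or k = h
    return math.sqrt(sum(citations[:k]))


def recent_citations(max_age):
    # sum(1,N,i,age[i] <= max_age,citations[i])
    return sum(c for c, a in zip(citations, age) if a <= max_age)


print(recent_citations(5))  # 12
print(root_of_top(3))       # square root of 10 + 6 + 4
```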
Release Notes and history
May-Jun 2012. 3.0 Release with many new features:
- Possibility to add custom normalization and index formulas (see the Custom Formulas section)
- 'Refine author list' and 'Refine all bibliographic entries' functions now much more accurate (can correctly extract lists of thousands of authors in most cases)
- Can now compute h-index values greater than 100
- Support for the new Scholar Modern look
- Many bug fixes and internal code optimization
Jan 13th, 2012. 2.3.5 Adapted to Google Scholar page layout changes. Other minor bug fixing.
May 31st, 2011. 2.3 Improved layout. Added indices normalized per age.
Apr 26th, 2011. 2.2 Authors names are now clickable and point to the corresponding query on Scholar.
Feb 24th, 2011. 2.1 New advanced refinement button per paper. Other minor improvements and bug fixing.
Oct 23rd, 2010. 2.0 Introduction of the advanced fine-tuning interface. Minor bug fixing and accuracy improvement in border cases.
Nov 26th, 2009. 1.4 and 1.4.1 Added radio buttons for selecting the query type on the fly. Improved author parsing.
Nov 21st, 2009. 1.3.3 Added on hover tooltips.
Nov 19th, 2009. 1.3.2 Improved appearance. Added link to the same query with 100 results if h-index and g-index fail to compute properly.
Nov 17th, 2009. 1.3 Introduced normalized indices and support for CiteSeerX.
Oct 26th, 2009. 1.2.3 Introduced checks for overflow of h,g,e,deltah,deltag values.
Oct 21st, 2009. 1.2.1 A minor fix on delta-H computation.
Oct 20th, 2009. 1.2. Introduced delta-H and delta-G. Minor fixes.
Oct 16th, 2009. 1.1.1. Fix bug on e-index computation.
Oct 11th, 2009. 1.1. Fix bug related to Google Scholar not strictly respecting citation descending order. Fix some citation values evaluated as NaN.
Related Work
