This data set reports the publication output (number of articles and number of citations received) of 200 scientists from the start of their careers to 2000. Most of the variables are derived from the Microsoft Academic Graph (MAG) data set. A few variables are randomly generated.
Usage
data(base_pub, package = "fixest")
Format
base_pub is a data frame with 4,024 observations and 10 variables. There are 200 different scientists and 51 different years (ending in 2000).
author_id: scientist identifier
year: current year
affil_id: affiliation ID of the scientist's current affiliation
affil_name: affiliation name of the scientist's current affiliation (character)
field: field name of the scientist (character), time invariant
nb_pub: number of publications of the scientist for the current year
nb_cites: number of citations received by the publications of the scientist in the current year. Accounts for the citations received from articles published up to 2020.
birth_year: birth year of the scientist (this is randomly generated)
is_woman: 1 if the scientist is a woman, 0 otherwise (this is randomly generated)
age: current age of the scientist (formally, year - birth_year)
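The description above can be checked directly against the data. The following is a minimal sketch, assuming the fixest package is installed; it only uses base R functions and the column names listed above, and prints the results rather than assuming them.

```r
# Load the data set shipped with fixest (assumes fixest is installed)
data(base_pub, package = "fixest")

# Dimensions of the data frame (the text states 4,024 rows and 10 columns)
dim(base_pub)

# Column names and types
str(base_pub)

# Number of distinct scientists and distinct years
length(unique(base_pub$author_id))
length(unique(base_pub$year))

# age is described as year - birth_year: check that this holds on every row
all(base_pub$age == base_pub$year - base_pub$birth_year)
```

Running these checks before any estimation is a cheap way to confirm that the installed version of the data matches this description.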
Source
The source of this data set is the Microsoft Academic Graph data set, extracted in 2020. MAG is now a defunct project; similar data can be found on OpenAlex.
The variables birth_year, is_woman and age were randomly generated. All other variables have been created from the raw MAG files.
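As an illustration of working only with the MAG-derived variables, the sketch below totals publications and citations per scientist over the observed years. It assumes the fixest package is installed; the object name `career` is hypothetical and chosen for this example only. Note that the formula interface of `aggregate()` drops rows with missing values by default.

```r
# Load the data set shipped with fixest (assumes fixest is installed)
data(base_pub, package = "fixest")

# Sum yearly publications and citations for each scientist.
# `career` is an illustrative name, not part of the package.
career <- aggregate(cbind(nb_pub, nb_cites) ~ author_id,
                    data = base_pub, FUN = sum)

# One row per scientist, with career totals up to 2000
head(career)
```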