This data set reports the publication output (number of articles and number of citations received) of a sample of scientists from the start of their careers to 2000. Most of the variables are processed from the Microsoft Academic Graph (MAG) data set. A few variables are randomly generated.
Usage
data(base_pub, package = "fixest")
Format
base_pub is a data frame with 4,024 observations and 10 variables. There are 200 different scientists and 51 different years (ends in 2000).
author_id: scientist identifier
year: current year
affil_id: affiliation ID of the scientist's current affiliation
affil_name: affiliation name of the scientist's current affiliation (character)
field: field name of the scientist (character), time invariant
nb_pub: number of publications of the scientist for the current year
nb_cites: number of citations received by the publications of the scientist in the current year. Accounts for the citations received from articles published up to 2020.
birth_year: birth year of the scientist (this is randomly generated)
is_woman: 1 if the scientist is a woman, 0 otherwise (this is randomly generated)
age: current age of the scientist (formally year - birth_year)
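The description above can be checked against the data once it is loaded. The following is a minimal sketch, assuming the fixest package is installed; age_is_consistent is a hypothetical helper written for this illustration and is not part of fixest. The comments restate the documented figures and are not captured output.

```r
# Hypothetical helper (not part of fixest): TRUE when every row satisfies
# the documented definition age = year - birth_year.
age_is_consistent <- function(df) {
  all(df$age == df$year - df$birth_year)
}

# Only touch the data set when fixest is available.
if (requireNamespace("fixest", quietly = TRUE)) {
  data(base_pub, package = "fixest")

  print(dim(base_pub))                       # documented: 4,024 rows, 10 columns
  print(length(unique(base_pub$author_id)))  # documented: 200 scientists
  print(range(base_pub$year))                # documented: last year is 2000
  print(age_is_consistent(base_pub))         # checks the age definition
}
```

Because birth_year, is_woman and age are randomly generated, these checks only confirm internal consistency; they say nothing about the real scientists.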
Source
The source of this data set is the Microsoft Academic Graph data set, extracted in 2020. MAG is now a defunct project; similar data can be found on OpenAlex.
The variables birth_year, is_woman and age were randomly generated. All other variables were created from the raw MAG files.