Package 'mediacloudr'

Title: Wrapper for the 'mediacloud.org' API
Description: API wrapper to gather news stories, media information and tags from the 'mediacloud.org' API, based on a multilevel query <https://mediacloud.org/>. A personal API key is required.
Authors: Dix Jan [cre, aut]
Maintainer: Dix Jan <[email protected]>
License: MIT + file LICENSE
Version: 0.1.1.9000
Built: 2024-10-31 03:19:24 UTC
Source: https://github.com/jandix/mediacloudr

Help Index


Extract meta data

Description

extract_meta_data extracts native, Open Graph and Twitter meta data from HTML documents. The meta data include URL, title, description and image. The HTML document is parsed within the function, so pass the raw HTML as a character string.

Usage

extract_meta_data(html_doc)

Arguments

html_doc

Character string containing the HTML document.

Value

List with three sublists: native, Open Graph and Twitter meta data.

Examples

## Not run: 
 library(httr)
 url <- "https://bits.blogs.nytimes.com/2013/04/07/the-potential-and-the-risks-of-data-science"
 response <- GET(url)
 html_document <- content(response, type = "text", encoding = "UTF-8")
 meta_data <- extract_meta_data(html_doc = html_document)

## End(Not run)
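To make the idea of native, Open Graph and Twitter meta data concrete, the following base R sketch pulls a few meta tags out of a small HTML string. It is not the implementation used by extract_meta_data: the helper meta_content, the sample document and the sublist names native, open_graph and twitter are assumptions made for illustration, and the regular expressions only handle double-quoted attributes.

```r
# Hypothetical helper, not part of mediacloudr: return the content attribute
# of the first <meta> tag whose property or name attribute equals `key`.
meta_content <- function(html_doc, key) {
  # Collect every <meta ...> tag in the document.
  tags <- regmatches(
    html_doc,
    gregexpr("<meta[^>]*>", html_doc, ignore.case = TRUE)
  )[[1]]
  # Keep only the tags that carry the requested key.
  pattern <- paste0("(property|name)=\"", key, "\"")
  hits <- tags[grepl(pattern, tags, ignore.case = TRUE)]
  if (length(hits) == 0) return(NA_character_)
  # Extract the value of content="..." from the first matching tag.
  m <- regmatches(
    hits[1],
    regexec("content=\"([^\"]*)\"", hits[1], ignore.case = TRUE)
  )[[1]]
  if (length(m) < 2) NA_character_ else m[2]
}

# Small sample document (made up for this sketch).
html_document <- paste0(
  "<html><head>",
  "<title>Example</title>",
  "<meta name=\"description\" content=\"Native description\">",
  "<meta property=\"og:title\" content=\"Open Graph title\">",
  "<meta name=\"twitter:title\" content=\"Twitter title\">",
  "</head><body></body></html>"
)

# Group the values into three sublists (names are assumed, see above).
meta_data <- list(
  native     = list(description = meta_content(html_document, "description")),
  open_graph = list(title = meta_content(html_document, "og:title")),
  twitter    = list(title = meta_content(html_document, "twitter:title"))
)
```

For real pages, a proper HTML parser is more robust than regular expressions; the sketch only shows which tags the three groups refer to.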

Get media by id

Description

get_media_source returns a media source by its id. A media source is one publisher. Every story that can be collected via get_story or get_story_list belongs to one media source.

Usage

get_media_source(media_id, api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

media_id

Positive integer that contains a valid media id.

api_key

Character string with the API key you get from mediacloud.org. An API key is required. By default, it is read from the MEDIACLOUD_API_KEY environment variable.

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#media for field descriptions.

Examples

## Not run: 
 media_source <- get_media_source(media_id = 604L)

## End(Not run)
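The api_key default in the Usage section reads the MEDIACLOUD_API_KEY environment variable with Sys.getenv. A minimal sketch of setting and checking that variable for the current session is shown here; the key value is a placeholder, not a real key.

```r
# Placeholder value for illustration only; use your personal key instead.
Sys.setenv(MEDIACLOUD_API_KEY = "my-secret-key")

# This is the same lookup the function defaults perform.
api_key <- Sys.getenv("MEDIACLOUD_API_KEY")

# Sys.getenv() returns "" for an unset variable, so check before calling the API.
if (!nzchar(api_key)) {
  stop("MEDIACLOUD_API_KEY is not set.")
}
```

To keep the key out of scripts, the same variable can instead be defined in a .Renviron file, which R reads at startup.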

Get story by id

Description

get_story returns news stories by their id. One story represents one online publication. Each story refers to a single URL from any feed within a single media source.

Usage

get_story(story_id, api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

story_id

Positive numeric that contains a valid story id.

api_key

Character string with the API key you get from mediacloud.org. An API key is required. By default, it is read from the MEDIACLOUD_API_KEY environment variable.

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#stories for field descriptions.

Examples

## Not run: 
 story <- get_story(story_id = 604L)

## End(Not run)

Get story list

Description

get_story_list returns a list of stories based on a multifaceted query. One story represents one online publication. Each story refers to a single URL from any feed within a single media source.

Usage

get_story_list(last_process_stories_id = 0L, rows = 100,
  feeds_id = NULL, q = NULL, fq = NULL,
  sort = "processed_stories_id", wc = FALSE, show_feeds = FALSE,
  api_key = Sys.getenv("MEDIACLOUD_API_KEY"))

Arguments

last_process_stories_id

Return stories in which the processed_stories_id is greater than this value.

rows

Number of stories to return, max 1000.

feeds_id

Return only stories that match the given feeds_id, sorted by descending publish date.

q

If specified, return only results that match the given Solr query. Only one q parameter may be included.

fq

If specified, filter results by the given Solr query. More than one fq parameter may be included.

sort

Sort order of the returned results. Supported values: processed_stories_id, random.

wc

If set to TRUE, include a 'word_count' field with each story that includes a count of the most common words in the story.

show_feeds

If set to TRUE, include a 'feeds' field with a list of the feeds associated with this story.

api_key

Character string with the API key you get from mediacloud.org. An API key is required. By default, it is read from the MEDIACLOUD_API_KEY environment variable.
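The q and fq arguments take Solr query strings. As a rough sketch, a date restriction could be assembled as an fq filter in base R as follows. The helper publish_date_filter is hypothetical, and the field name publish_date and its timestamp format are assumptions based on the Media Cloud API specification linked in the Value section; check the specification before relying on them.

```r
# Hypothetical helper, not part of mediacloudr: build a Solr range filter
# on the (assumed) publish_date field from two dates.
publish_date_filter <- function(from, to) {
  # Format a date as a Solr timestamp at midnight UTC.
  fmt <- function(d) format(as.Date(d), "%Y-%m-%dT00:00:00Z")
  paste0("publish_date:[", fmt(from), " TO ", fmt(to), "]")
}

fq <- publish_date_filter("2019-01-01", "2019-02-01")

# The filter would then be passed alongside a query (needs an API key):
# stories <- get_story_list(q = "Trump", fq = fq)
```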

Value

Data frame with results. See https://github.com/berkmancenter/mediacloud/blob/master/doc/api_2_0_spec/api_2_0_spec.md#stories for field descriptions.

Examples

## Not run: 
 stories <- get_story_list()
 stories <- get_story_list(q = "Trump")

## End(Not run)
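Because rows is capped at 1000, larger result sets have to be collected page by page, using last_process_stories_id as a cursor. The sketch below shows one way to do that. The helper fetch_stories and the stand-in mock_fetch are hypothetical, and the sketch assumes that each returned data frame has a processed_stories_id column, as described in the API specification linked in the Value section.

```r
# Hypothetical helper, not part of mediacloudr: call `fetch_page` repeatedly,
# advancing the cursor to the largest processed_stories_id seen so far.
fetch_stories <- function(fetch_page, max_pages = 3, rows = 100) {
  last_id <- 0L
  pages <- list()
  for (i in seq_len(max_pages)) {
    page <- fetch_page(last_process_stories_id = last_id, rows = rows)
    # Stop when the source has no more stories.
    if (is.null(page) || nrow(page) == 0) break
    pages[[length(pages) + 1]] <- page
    last_id <- max(page$processed_stories_id)
    # A short page means the end of the result set was reached.
    if (nrow(page) < rows) break
  }
  do.call(rbind, pages)
}

# Stand-in for get_story_list so the sketch runs without an API key:
# it pretends that 250 stories exist in total.
mock_fetch <- function(last_process_stories_id, rows) {
  ids <- seq(last_process_stories_id + 1, length.out = rows)
  data.frame(processed_stories_id = ids[ids <= 250])
}

stories <- fetch_stories(mock_fetch, max_pages = 5, rows = 100)

# With a real key, the fetcher could wrap get_story_list instead:
# stories <- fetch_stories(function(...) get_story_list(q = "Trump", ...))
```

The max_pages limit keeps the loop from issuing an unbounded number of requests when a query matches many stories.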

HTML document to test extract_meta_data

Description

An HTML document with basic meta tags for Open Graph, Twitter and native meta data.

Usage

meta_data_html

Format

An object of class character of length 1.