Tuesday, October 11, 2022

[SOLVED] List and access files in an ownCloud public folder with R

Issue

I need to provide data to my students so they can use it with R in class. I uploaded the data to a public folder in ownCloud. The link to the folder is public, without any password.

I can't figure out how to list the links to the individual files, so that the students can read each of them directly from R.

So far I used:

r <- RCurl::getURL("https://server",verbose=FALSE, dirlistonly = TRUE)
XML::getHTMLLinks(r)

but the result is:

[1] "http://enable-javascript.com/"                                         
[2] "/owncloud/index.php"                                                   
[3] "https://server"
[4] ""                                                                      
[5] ""                                                                      
[6] "https://owncloud.org"                       

i.e. only the links from the page header and footer, not the links to the individual files in the folder.

Any help is appreciated, thanks,

A


Solution

Ok, after more digging around (e.g. here), I found the solution. The trick is to query ownCloud's WebDAV service and use its API endpoint for public shares.

# Install required packages if you do not have them
# install.packages("xml2")
# install.packages("httr")


# specify your ownCloud provider
provider <- "https://owncloud.example.com" 

# specify your webDav endpoint (for publicly shared folders, do not include username)
# will most likely be "remote.php/dav" or "remote.php/webdav" 
webdav <- "remote.php/dav" 

# specify API for public links
api <- "public-files"

# specify sharing token portion of the URL
token <- "ToFnJDJKz27EQU" # just an example token

# construct URL
url <- paste(provider, webdav, api, token, sep = "/") 

# specify depth: 0 returns only the folder itself, 1 adds its direct contents;
# WebDAV also defines "infinity" for all subfolders, if the server allows it
depth <- 1

# run request
r <- httr::VERB(
    verb = "PROPFIND",
    url = url,
    httr::add_headers(depth = depth),
    httr::authenticate(token, "")
)

# parse result
x <- httr::content(r)
xml_links <- xml2::xml_find_all(x, ".//d:href")
partial_links <- xml2::xml_text(xml_links)

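# get direct download links via WebDAV
# (the hrefs are usually server-absolute paths, and the result also
# includes an entry for the shared folder itself)
links <- paste0(provider, partial_links)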
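
To get from that list of hrefs to something students can load, the folder entry has to be dropped and the files of interest selected. Below is a minimal sketch in base R. The example hrefs are made up to resemble a PROPFIND result for the example token above, and the assumption that the data files are CSVs is mine, not part of the original answer. Folder entries are assumed to end with a slash.

```r
# Hypothetical hrefs, shaped like partial_links from the request above
partial_links <- c(
  "/remote.php/dav/public-files/ToFnJDJKz27EQU/",
  "/remote.php/dav/public-files/ToFnJDJKz27EQU/data1.csv",
  "/remote.php/dav/public-files/ToFnJDJKz27EQU/my%20data.csv",
  "/remote.php/dav/public-files/ToFnJDJKz27EQU/notes.txt"
)
provider <- "https://owncloud.example.com"

# Folders (WebDAV collections) are assumed to end with "/"; keep files only
file_paths <- partial_links[!grepl("/$", partial_links)]

# Keep only the CSV files (assumed file type)
csv_paths <- file_paths[grepl("\\.csv$", file_paths, ignore.case = TRUE)]

# Build full links and name them by their decoded file names
csv_links <- paste0(provider, csv_paths)
names(csv_links) <- vapply(
  basename(csv_paths), utils::URLdecode, character(1), USE.NAMES = FALSE
)

# Helper that reads every link into a named list of data frames
# (not called here because it needs a reachable server)
read_all <- function(links) lapply(links, utils::read.csv)
```

With a real share, `datasets <- read_all(csv_links)` would then give one data frame per file. Whether the links can be read without credentials depends on the server; if not, downloading each file with httr::GET and the same httr::authenticate(token, "") call as above may be needed.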


Answered By - Radim
Answer Checked By - Candace Johnson (WPSolving Volunteer)