Issue
There are similar questions to this but I haven't found anything that directly relates to my issue. I am a little new to PHP and curl, so bear with me and thanks in advance.
Description: I have a PHP application that uploads files to the SharePoint 2019 REST API. It works for all file types so far, except .doc and .docx files. Those files are posted successfully, but once downloaded and opened I get errors such as:
"Word found unreadable content in {filename}.docx. Do you want to recover the contents of this document? If you trust the source of this document, click Yes."
If I click Yes, the file opens without problems. If I download the file directly from the SharePoint site, it has the same problem. How do I pass a .docx file to the REST API with curl? It would seem that there is some encoding issue, but I am not sure how to tell which side it's on, since SharePoint doesn't report any problems with the upload. The other article I found on Stack Overflow breaks the data apart, but that is for the DocuSign REST API and is from 2013 (found here). Do I need to break up the data on my calls as well?
Below is my Code for File uploads
$files = $_FILES;
$local_file = $_FILES['input_document_upload'];
$fileName = $local_file['name'];
//I am assuming there is something with the encoding for curl_file_create below I am missing
$data = array(
'uploaded_file' => curl_file_create($local_file['tmp_name'], $local_file['type'], $fileName)
);
$client_upload_url = //ends in _api/web/lists/getbytitle('{documentFolder}')/rootfolder/files/add
$client_upload_url .= "(url='". $fileName ."',overwrite=true)";
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => $client_upload_url,//<-- no problem here since it uploads correctly 99% of the time
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 30,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_NONE ,
CURLOPT_POSTFIELDS=> $data,
CURLOPT_CUSTOMREQUEST => "POST",
CURLOPT_HTTPHEADER => array(
"Accept: application/json;odata=verbose",
"cache-control: no-cache",
"X-RequestDigest: " . $digest_value,
// Hardcoded the type below, but I have tried several different Content-Type settings to get this working:
// multipart/form-data
// application/octet-stream
"Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"Authorization: {redacted}"
),
));
$response = curl_exec($curl);
Thanks for your Time!
Solution
Well, after much experimentation and searching, I figured out the answer to this problem. The biggest culprit is indeed the content type. If you use the following code:
$data = array(
'uploaded_file' => curl_file_create($local_file['tmp_name'], $local_file['type'], $fileName)
);
/// redacted for space
CURLOPT_POSTFIELDS=> $data,
When CURLOPT_POSTFIELDS is given an array, curl encodes the body as multipart/form-data, strips out any Content-Type you give it, and supplies its own. You get the following request:
Content-Type: multipart/form-data; boundary=----------637571310612295910
Content-Length: 12184
------------637571310612295910
Content-Disposition: form-data; name="uploaded_file"; filename="{filename}.docx"
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
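SharePoint does not like this at all. The endpoint appears to store the request body as the file's content, so the boundary lines and part headers end up inside the saved .docx, which would explain Word's "unreadable content" warning. So you need to send the raw binary data, not multipart/form-data. This can be achieved like so: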
$uploadFile = file_get_contents($local_file['tmp_name']);
$curl = curl_init();
curl_setopt_array($curl, array(
CURLOPT_URL => $client_upload_url,
CURLOPT_RETURNTRANSFER => true,
CURLOPT_ENCODING => "",
CURLOPT_MAXREDIRS => 10,
CURLOPT_TIMEOUT => 30,
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_NONE ,
CURLOPT_CUSTOMREQUEST => "POST",
CURLOPT_POSTFIELDS => $uploadFile, //<-- where the magic happens: a string, not an array
CURLOPT_HTTPHEADER => array(
"Accept: application/json;odata=verbose",
"cache-control: no-cache",
"X-RequestDigest: " . $digest_value,
"Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"Authorization: {values}"
),
));
This produces a request like the following:
Accept: application/json; odata=verbose
Cache-Control: no-cache
X-RequestDigest:{redacted}
Authorization: {redacted}
Connection: Keep-Alive
Request-Id: |1bf3eca4-45702011fc30c20b.2.
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Content-Length: 11947
{raw body here, not multipart/formdata}
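The gap between the two Content-Length values above (12184 for the multipart request, 11947 for the raw one) is the multipart wrapping. A minimal sketch of that effect follows. It makes no network calls; the payload, boundary, file name, and the wrap_multipart helper are all hypothetical stand-ins, and the wrapper only approximates what curl generates.

```php
<?php
// Hypothetical stand-in for the uploaded file's bytes. A real .docx is a ZIP
// archive, so its first two bytes are "PK".
$fileBytes = "PK\x03\x04" . str_repeat("\x00", 64);

/**
 * Wrap a payload in a single-part multipart/form-data body, roughly the way
 * curl does when CURLOPT_POSTFIELDS is given an array. (Illustrative helper,
 * not part of any library.)
 */
function wrap_multipart(string $payload, string $boundary, string $fileName): string
{
    return "--{$boundary}\r\n"
        . "Content-Disposition: form-data; name=\"uploaded_file\"; filename=\"{$fileName}\"\r\n"
        . "Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document\r\n"
        . "\r\n"
        . $payload . "\r\n"
        . "--{$boundary}--\r\n";
}

$multipartBody = wrap_multipart($fileBytes, '----------example', 'example.docx');

// Extra bytes that would be saved if the server stores the request body verbatim.
$overhead = strlen($multipartBody) - strlen($fileBytes);

// The raw payload starts with the ZIP signature; the multipart body does not.
$rawStartsWithZip = strncmp($fileBytes, "PK", 2) === 0;
$multipartStartsWithZip = strncmp($multipartBody, "PK", 2) === 0;

printf("overhead: %d bytes\n", $overhead);
printf("raw starts with PK: %s\n", var_export($rawStartsWithZip, true));
printf("multipart starts with PK: %s\n", var_export($multipartStartsWithZip, true));
```

If the server saves the body as-is, the stored file begins with a boundary line instead of the ZIP signature, which fits the behavior described in the question: Word complains but can still recover the document.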
The moral of the story is that file_get_contents gives you the file's binary data as a string, which you can pass directly to CURLOPT_POSTFIELDS.
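For reuse, the same idea can be wrapped in a small helper that also checks that the file could be read before anything is sent. This is only a sketch under stated assumptions: build_raw_upload_options and its parameter names are hypothetical, it needs PHP's curl extension for the CURLOPT_* constants, and the default Content-Type is simply the one used in the answer.

```php
<?php
/**
 * Build curl options for POSTing one file as a raw request body.
 * Hypothetical helper: the name and parameters are illustrative only.
 */
function build_raw_upload_options(
    string $uploadUrl,
    string $path,
    string $digest,
    string $authorization,
    string $contentType = 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
): array {
    // Fail early instead of posting an empty or partial body.
    if (!is_readable($path)) {
        throw new RuntimeException("Cannot read upload file: {$path}");
    }
    $body = file_get_contents($path);
    if ($body === false) {
        throw new RuntimeException("Failed to load upload file: {$path}");
    }

    return [
        CURLOPT_URL            => $uploadUrl,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_TIMEOUT        => 30,
        CURLOPT_CUSTOMREQUEST  => 'POST',
        // A string (not an array) is sent as the body unchanged.
        CURLOPT_POSTFIELDS     => $body,
        CURLOPT_HTTPHEADER     => [
            'Accept: application/json;odata=verbose',
            'X-RequestDigest: ' . $digest,
            'Content-Type: ' . $contentType,
            'Authorization: ' . $authorization,
        ],
    ];
}

// Usage sketch (placeholder values, not executed here):
// $curl = curl_init();
// curl_setopt_array($curl, build_raw_upload_options($client_upload_url, $local_file['tmp_name'], $digest_value, $auth_value));
// $response = curl_exec($curl);
```

Reading the whole file into memory mirrors the answer's approach and is fine for typical documents; very large files might call for a streaming upload instead.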
Astonishingly, the inspiration for this came from a post from 2008, found here.
Answered By - user15984297