logo

pleroma

My custom branche(s) on git.pleroma.social/pleroma/pleroma git clone https://hacktivis.me/git/pleroma.git
commit: cd316d7269a6cac1e9edb732b202343001b82399
parent 75f912c63f9a18e37f8ddbd403755b878467435a
Author: Ilja <ilja@ilja.space>
Date:   Mon, 14 Feb 2022 13:14:25 +0100

Use EXIF data of image to prefill image description

During attachment upload Pleroma returns a "description" field. Pleroma-fe has an MR to use that to pre-fill the image description field, <https://git.pleroma.social/pleroma/pleroma-fe/-/merge_requests/1399>

* This MR allows Pleroma to read the EXIF data during upload and return the description to the FE
    * If a description is already present (e.g. because a previous module added it), it will use that
    * Otherwise it will read from the EXIF data. First it will check -ImageDescription, if that's empty, it will check -iptc:Caption-Abstract
    * If no description is found, it will simply return nil, just like before
* When people set up a new instance, they will be asked if they want to read metadata and this module will be activated if so

This was taken from an MR i did on Pleroma and isn't finished yet.

Diffstat:

Mdocs/configuration/cheatsheet.md6++++++
Mdocs/installation/optional/media_graphics_packages.md15++++++++-------
Mlib/mix/tasks/pleroma/instance.ex26++++++++++++++++++++++++++
Mlib/pleroma/application_requirements.ex1+
Mlib/pleroma/upload.ex27++++++++++++++++++++-------
Alib/pleroma/upload/filter/exiftool_read_data.ex47+++++++++++++++++++++++++++++++++++++++++++++++
Atest/fixtures/portrait_of_owls_caption-abstract_tmp.jpg0
Atest/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg0
Atest/fixtures/portrait_of_owls_no_description_tmp.jpg0
Mtest/mix/tasks/pleroma/instance_test.exs7++++++-
Atest/pleroma/upload/filter/exiftool_read_data_test.exs106+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
11 files changed, 220 insertions(+), 15 deletions(-)

diff --git a/docs/configuration/cheatsheet.md b/docs/configuration/cheatsheet.md @@ -633,6 +633,12 @@ This filter only strips the GPS and location metadata with Exiftool leaving colo No specific configuration. +#### Pleroma.Upload.Filter.ExiftoolReadData + +This filter only reads metadata with Exiftool so clients can prefill the media description field. + +No specific configuration. + #### Pleroma.Upload.Filter.Mogrify * `args`: List of actions for the `mogrify` command like `"strip"` or `["strip", "auto-orient", {"implode", "1"}]`. diff --git a/docs/installation/optional/media_graphics_packages.md b/docs/installation/optional/media_graphics_packages.md @@ -1,9 +1,9 @@ # Optional software packages needed for specific functionality For specific Pleroma functionality (which is disabled by default) some or all of the below packages are required: - * `ImageMagic` - * `ffmpeg` - * `exiftool` + * `ImageMagic` + * `ffmpeg` + * `exiftool` Please refer to documentation in `docs/installation` on how to install them on specific OS. @@ -14,19 +14,20 @@ Note: the packages are not required with the current default settings of Pleroma `ImageMagick` is a set of tools to create, edit, compose, or convert bitmap images. It is required for the following Pleroma features: - * `Pleroma.Upload.Filters.Mogrify`, `Pleroma.Upload.Filters.Mogrifun` upload filters (related config: `Plaroma.Upload/filters` in `config/config.exs`) - * Media preview proxy for still images (related config: `media_preview_proxy/enabled` in `config/config.exs`) + * `Pleroma.Upload.Filters.Mogrify`, `Pleroma.Upload.Filters.Mogrifun` upload filters (related config: `Plaroma.Upload/filters` in `config/config.exs`) + * Media preview proxy for still images (related config: `media_preview_proxy/enabled` in `config/config.exs`) ## `ffmpeg` `ffmpeg` is software to record, convert and stream audio and video. It is required for the following Pleroma features: - * Media preview proxy for videos (related config: `media_preview_proxy/enabled` in `config/config.exs`) + * Media preview proxy for videos (related config: `media_preview_proxy/enabled` in `config/config.exs`) ## `exiftool` `exiftool` is media files metadata reader/writer. It is required for the following Pleroma features: - * `Pleroma.Upload.Filters.Exiftool` upload filter (related config: `Plaroma.Upload/filters` in `config/config.exs`) + * `Pleroma.Upload.Filters.Exiftool` upload filter (related config: `Plaroma.Upload/filters` in `config/config.exs`) + * `Pleroma.Upload.Filters.ExiftoolReadData` upload filter (related config: `Plaroma.Upload/filters` in `config/config.exs`) diff --git a/lib/mix/tasks/pleroma/instance.ex b/lib/mix/tasks/pleroma/instance.ex @@ -35,6 +35,7 @@ defmodule Mix.Tasks.Pleroma.Instance do listen_ip: :string, listen_port: :string, strip_uploads: :string, + read_uploads_data: :string, anonymize_uploads: :string, dedupe_uploads: :string ], @@ -178,6 +179,23 @@ defmodule Mix.Tasks.Pleroma.Instance do strip_uploads_default ) === "y" + {read_uploads_data_message, read_uploads_data_default} = + if Pleroma.Utils.command_available?("exiftool") do + {"Do you want to read data from uploaded files so clients can use it to prefill fields like image description? This requires exiftool, it was detected as installed. (y/n)", + "y"} + else + {"Do you want to read data from uploaded files so clients can use it to prefill fields like image description? This requires exiftool, it was detected as not installed, please install it if you answer yes. (y/n)", + "n"} + end + + read_uploads_data = + get_option( + options, + :read_uploads_data, + read_uploads_data_message, + read_uploads_data_default + ) === "y" + anonymize_uploads = get_option( options, @@ -230,6 +248,7 @@ defmodule Mix.Tasks.Pleroma.Instance do upload_filters: upload_filters(%{ strip: strip_uploads, + read_data: read_uploads_data, anonymize: anonymize_uploads, dedupe: dedupe_uploads }) @@ -304,6 +323,13 @@ defmodule Mix.Tasks.Pleroma.Instance do end enabled_filters = + if filters.read_data do + enabled_filters ++ [Pleroma.Upload.Filter.ExiftoolReadData] + else + enabled_filters + end + + enabled_filters = if filters.anonymize do enabled_filters ++ [Pleroma.Upload.Filter.AnonymizeFilename] else diff --git a/lib/pleroma/application_requirements.ex b/lib/pleroma/application_requirements.ex @@ -165,6 +165,7 @@ defmodule Pleroma.ApplicationRequirements do defp check_system_commands!(:ok) do filter_commands_statuses = [ check_filter(Pleroma.Upload.Filter.Exiftool, "exiftool"), + check_filter(Pleroma.Upload.Filter.ExiftoolReadData, "exiftool"), check_filter(Pleroma.Upload.Filter.Mogrify, "mogrify"), check_filter(Pleroma.Upload.Filter.Mogrifun, "mogrify"), check_filter(Pleroma.Upload.Filter.AnalyzeMetadata, "mogrify"), diff --git a/lib/pleroma/upload.ex b/lib/pleroma/upload.ex @@ -60,12 +60,23 @@ defmodule Pleroma.Upload do width: integer(), height: integer(), blurhash: String.t(), + description: String.t(), path: String.t() } - defstruct [:id, :name, :tempfile, :content_type, :width, :height, :blurhash, :path] - - defp get_description(opts, upload) do - case {opts[:description], Pleroma.Config.get([Pleroma.Upload, :default_description])} do + defstruct [ + :id, + :name, + :tempfile, + :content_type, + :width, + :height, + :blurhash, + :description, + :path + ] + + defp get_description(upload) do + case {upload.description, Pleroma.Config.get([Pleroma.Upload, :default_description])} do {description, _} when is_binary(description) -> description {_, :filename} -> upload.name {_, str} when is_binary(str) -> str @@ -81,7 +92,7 @@ defmodule Pleroma.Upload do with {:ok, upload} <- prepare_upload(upload, opts), upload = %__MODULE__{upload | path: upload.path || "#{upload.id}/#{upload.name}"}, {:ok, upload} <- Pleroma.Upload.Filter.filter(opts.filters, upload), - description = get_description(opts, upload), + description = get_description(upload), {_, true} <- {:description_limit, String.length(description) <= Pleroma.Config.get([:instance, :description_limit])}, @@ -152,7 +163,8 @@ defmodule Pleroma.Upload do id: UUID.generate(), name: file.filename, tempfile: file.path, - content_type: file.content_type + content_type: file.content_type, + description: opts.description }} end end @@ -172,7 +184,8 @@ defmodule Pleroma.Upload do id: UUID.generate(), name: hash <> "." <> ext, tempfile: tmp_path, - content_type: content_type + content_type: content_type, + description: opts.description }} end end diff --git a/lib/pleroma/upload/filter/exiftool_read_data.ex b/lib/pleroma/upload/filter/exiftool_read_data.ex @@ -0,0 +1,47 @@ +# Pleroma: A lightweight social networking server +# Copyright © 2017-2021 Pleroma Authors <https://pleroma.social/> +# SPDX-License-Identifier: AGPL-3.0-only + +defmodule Pleroma.Upload.Filter.ExiftoolReadData do + @moduledoc """ + Gets the description from the related EXIF tags and provides them in the response if no description is provided yet. + It will first check ImageDescription, when that's too long or empty, it will check iptc:Caption-Abstract. + When the description is too long (see `:instance, :description_limit`), an empty string is returned. + """ + @behaviour Pleroma.Upload.Filter + + @spec filter(Pleroma.Upload.t()) :: {:ok, any()} | {:error, String.t()} + + def filter(%Pleroma.Upload{description: description}) + when is_binary(description), + do: {:ok, :noop} + + def filter(%Pleroma.Upload{tempfile: file} = upload), + do: {:ok, :filtered, upload |> Map.put(:description, read_description_from_exif_data(file))} + + def filter(_, _), do: {:ok, :noop} + + defp read_description_from_exif_data(file) do + nil + |> read_when_empty(file, "-ImageDescription") + |> read_when_empty(file, "-iptc:Caption-Abstract") + end + + defp read_when_empty(current_description, _, _) when is_binary(current_description), + do: current_description + + defp read_when_empty(_, file, tag) do + try do + {tag_content, 0} = + System.cmd("exiftool", ["-b", "-s3", tag, file], stderr_to_stdout: true, parallelism: true) + + if tag_content != "" and + String.length(tag_content) <= + Pleroma.Config.get([:instance, :description_limit]), + do: tag_content, + else: nil + rescue + _ in ErlangError -> nil + end + end +end diff --git a/test/fixtures/portrait_of_owls_caption-abstract_tmp.jpg b/test/fixtures/portrait_of_owls_caption-abstract_tmp.jpg Binary files differ. diff --git a/test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg b/test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg Binary files differ. diff --git a/test/fixtures/portrait_of_owls_no_description_tmp.jpg b/test/fixtures/portrait_of_owls_no_description_tmp.jpg Binary files differ. diff --git a/test/mix/tasks/pleroma/instance_test.exs b/test/mix/tasks/pleroma/instance_test.exs @@ -69,6 +69,8 @@ defmodule Mix.Tasks.Pleroma.InstanceTest do "./test/../test/instance/static/", "--strip-uploads", "y", + "--read-uploads-data", + "y", "--dedupe-uploads", "n", "--anonymize-uploads", @@ -91,7 +93,10 @@ defmodule Mix.Tasks.Pleroma.InstanceTest do assert generated_config =~ "password: \"dbpass\"" assert generated_config =~ "configurable_from_database: true" assert generated_config =~ "http: [ip: {127, 0, 0, 1}, port: 4000]" - assert generated_config =~ "filters: [Pleroma.Upload.Filter.Exiftool]" + + assert generated_config =~ + "filters: [Pleroma.Upload.Filter.Exiftool, Pleroma.Upload.Filter.ExiftoolReadData]" + assert File.read!(tmp_path() <> "setup.psql") == generated_setup_psql() assert File.exists?(Path.expand("./test/instance/static/robots.txt")) end diff --git a/test/pleroma/upload/filter/exiftool_read_data_test.exs b/test/pleroma/upload/filter/exiftool_read_data_test.exs @@ -0,0 +1,106 @@ +# Pleroma: A lightweight social networking server +# Copyright © 2017-2021 Pleroma Authors <https://pleroma.social/> +# SPDX-License-Identifier: AGPL-3.0-only + +defmodule Pleroma.Upload.Filter.ExiftoolReadDataTest do + use Pleroma.DataCase, async: true + alias Pleroma.Upload.Filter + + @uploads %Pleroma.Upload{ + name: "portrait_of_owls_imagedescription_and_caption-abstract.jpg", + content_type: "image/jpeg", + path: + Path.absname("test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract.jpg"), + tempfile: + Path.absname("test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg"), + description: nil + } + + test "keeps description when not empty" do + uploads = %Pleroma.Upload{ + name: "portrait_of_owls_imagedescription_and_caption-abstract.jpg", + content_type: "image/jpeg", + path: + Path.absname("test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract.jpg"), + tempfile: + Path.absname( + "test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg" + ), + description: "Eight different owls" + } + + assert Filter.ExiftoolReadData.filter(uploads) == + {:ok, :noop} + end + + test "otherwise returns ImageDescription when present" do + uploads_after = %Pleroma.Upload{ + name: "portrait_of_owls_imagedescription_and_caption-abstract.jpg", + content_type: "image/jpeg", + path: + Path.absname("test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract.jpg"), + tempfile: + Path.absname( + "test/fixtures/portrait_of_owls_imagedescription_and_caption-abstract_tmp.jpg" + ), + description: "Pictures of eight different owls" + } + + assert Filter.ExiftoolReadData.filter(@uploads) == + {:ok, :filtered, uploads_after} + end + + test "otherwise returns iptc:Caption-Abstract when present" do + upload = %Pleroma.Upload{ + name: "portrait_of_owls_caption-abstract.jpg", + content_type: "image/jpeg", + path: Path.absname("test/fixtures/portrait_of_owls_caption-abstract.jpg"), + tempfile: Path.absname("test/fixtures/portrait_of_owls_caption-abstract_tmp.jpg"), + description: nil + } + + upload_after = %Pleroma.Upload{ + name: "portrait_of_owls_caption-abstract.jpg", + content_type: "image/jpeg", + path: Path.absname("test/fixtures/portrait_of_owls_caption-abstract.jpg"), + tempfile: Path.absname("test/fixtures/portrait_of_owls_caption-abstract_tmp.jpg"), + description: "Pictures of eight different owls - iptc" + } + + assert Filter.ExiftoolReadData.filter(upload) == + {:ok, :filtered, upload_after} + end + + test "otherwise returns nil" do + uploads = %Pleroma.Upload{ + name: "portrait_of_owls_no_description-abstract.jpg", + content_type: "image/jpeg", + path: Path.absname("test/fixtures/portrait_of_owls_no_description.jpg"), + tempfile: Path.absname("test/fixtures/portrait_of_owls_no_description_tmp.jpg"), + description: nil + } + + assert Filter.ExiftoolReadData.filter(uploads) == + {:ok, :filtered, uploads} + end + + test "Return nil when image description from EXIF data exceeds the maximum length" do + clear_config([:instance, :description_limit], 5) + + assert Filter.ExiftoolReadData.filter(@uploads) == + {:ok, :filtered, @uploads} + end + + test "Return nil when image description from EXIF data can't be read" do + uploads = %Pleroma.Upload{ + name: "non-existant.jpg", + content_type: "image/jpeg", + path: Path.absname("test/fixtures/non-existant.jpg"), + tempfile: Path.absname("test/fixtures/non-existant_tmp.jpg"), + description: nil + } + + assert Filter.ExiftoolReadData.filter(uploads) == + {:ok, :filtered, uploads} + end +end