(as)  [sysadmin] [blog]

User Tools

Site Tools


attachment_filter_pl

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
attachment_filter_pl [2016-06-16 21:45]
andreas Bug fix: Apparently parts_set is now required
attachment_filter_pl [2020-10-12 21:19]
andreas Discussion section archived + disabled commenting
Line 1: Line 1:
 +====== attachment_filter.pl ======
 +
 +A Perl script to remove attachments from e-mail messages
 +
 +===== Synopsis =====
 +
 +<code bash>
 +attachment_filter.pl < message
 +</code>
 +
 +===== Description =====
 +
 +''attachment_filter.pl'' reads a single email message from STDIN, parses it using [[http://search.cpan.org/perldoc?Email::MIME |Email::MIME]] and returns the email message at STDOUT. Attachments whose Content-Type do not match the regular expression in ''$allowed_attachments'' are replaced with a ''$replacement_text''.
 +
 +===== Script =====
 +
 +<file perl attachment_filter.pl>
 +#!/usr/bin/perl
 +#
 +# Script to remove attachments from mail messages using Email::MIME
 +#
 +# All attachments whose Content-Type does not match $allowed_attachments
 +# are removed and replaced with $replacement_text.
 +
 +# Version: 1.2 * 2016-06-16 (c) Andreas Schamanek
 +
 +# This script is distributed in the hope that it will be useful,
 +# but WITHOUT ANY WARRANTY. Use at your own risk!
 +#
 +# URL https://fam.tuwien.ac.at/~schamane/_/attachment_filter_pl
 +
 +use strict;
 +use warnings;
 +
 +# Email::MIME: http://search.cpan.org/perldoc?Email::MIME
 +# Debian package libemail-mime-perl
 +use Email::MIME;
 +
 +# Regex to match on allowed content-types (case-insensitive)
 +my $allowed_attachments = "^((text|message)/|application/(pkcs|pgp))";
 +
 +# Replacement text (must be us-ascii; or change the script)
 +my $replacement_text = "Attachment removed. Original Content-Type:\n  ";
 +
 +# Note: Variable $email is only used if $mail_was_modified = 1; so, if we 
 +# know that attachments are indeed stripped we could do without $email.
 +
 +# Global Variables
 +use vars qw(
 + $email $email_parsed $first_line
 + $mail_was_modified
 +);
 +
 +$mail_was_modified = 0;
 +
 +# Get email from standard input
 +$first_line = scalar <STDIN>;
 +# Read in the remainder
 +$email = join('', <STDIN>);
 +# Prepend $first_line unless it is a "From ..." line; in which case we 
 +# preserve it to work around some bugs
 +$email = $first_line . $email unless ($first_line =~ /^From .*\@/);
 +
 +# Parse email
 +$email_parsed = Email::MIME->new($email);
 +
 +# Debug MIME structure
 +#print STDERR "(debug) parsed structure:\n".$email_parsed->debug_structure;
 +
 +# Remove attachments that do not match /$allowed_attachments/i
 +$email_parsed->walk_parts(sub {
 + my ($part) = @_;
 + return if $part->subparts; # multipart
 +
 + if ($part->content_type && $part->content_type !~ /$allowed_attachments/i)
 + {
 + # record that the email has been modified
 + $mail_was_modified = 1;
 + # replacement text
 + $part->body_set($replacement_text . $part->content_type."\n");
 + $part->encoding_set('8bit');
 + $part->content_type_set('text/plain');
 + $part->charset_set('us-ascii');
 + }
 +});
 +
 +# replace email with stripped email
 +if ($mail_was_modified)
 +{
 + my @parts_new = $email_parsed->parts;
 + $email_parsed->parts_set( \@parts_new );
 + $email = $email_parsed->as_string;
 +};
 +
 +# output the message
 +print $first_line if ($first_line =~ /^From .*\@/);
 +print $email;
 +</file>
 +
 +===== Disclaimer =====
 +
 +I do not know much Perl. Suggestions for improvements are welcome!\\
 +Also, I haven't tested this much. Use at your own risk!
 +
 +===== Download =====
 +
 +  * [[http://www.fam.tuwien.ac.at/~schamane/_/_export/code/attachment_filter_pl?codeblock=1|attachment_filter.pl]]
 +
 +===== See also =====
 +
 +  * Blog post [[blog:130109_delete_mail_attachments|]] (2013-01-09)
 +
 +===== Discussion =====
 +
 +//Andreas Schamanek, 2016-08-23 15:13//
 +
 +''attchment_filter.pl'' has (probably) several shortcomings. One known bug is that if 1 part of a multipart/alternative is removed or changed, the multipart/alternative needs to be converted to e.g. multipart/mixed.
 +
  
attachment_filter_pl.txt · Last modified: 2020-10-12 21:19 by andreas