(as)  [sysadmin] [blog]

User Tools

Site Tools


attachment_filter_pl

attachment_filter.pl

A Perl script to remove attachments from e-mail messages

Synopsis

attachment_filter.pl < message

Description

attachment_filter.pl reads a single email message from STDIN, parses it using Email::MIME and returns the email message at STDOUT. Attachments whose Content-Type do not match the regular expression in $allowed_attachments are replaced with a $replacement_text.

Script

attachment_filter.pl
#!/usr/bin/perl
#
# Script to remove attachments from mail messages using Email::MIME
#
# All attachments whose Content-Type does not match $allowed_attachments
# are removed and replaced with $replacement_text.
 
# Version: 1.2 * 2016-06-16 (c) Andreas Schamanek
 
# This script is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY. Use at your own risk!
#
# URL https://fam.tuwien.ac.at/~schamane/_/attachment_filter_pl
 
use strict;
use warnings;
 
# Email::MIME: http://search.cpan.org/perldoc?Email::MIME
# Debian package libemail-mime-perl
use Email::MIME;
 
# Regex to match on allowed content-types (case-insensitive)
my $allowed_attachments = "^((text|message)/|application/(pkcs|pgp))";
 
# Replacement text (must be us-ascii; or change the script)
my $replacement_text = "Attachment removed. Original Content-Type:\n  ";
 
# Note: Variable $email is only used if $mail_was_modified = 1; so, if we 
# know that attachments are indeed stripped we could do without $email.
 
# Global Variables
use vars qw(
	$email $email_parsed $first_line
	$mail_was_modified
);
 
$mail_was_modified = 0;
 
# Get email from standard input
$first_line = scalar <STDIN>;
# Read in the remainder
$email = join('', <STDIN>);
# Prepend $first_line unless it is a "From ..." line; in which case we 
# preserve it to work around some bugs
$email = $first_line . $email unless ($first_line =~ /^From .*\@/);
 
# Parse email
$email_parsed = Email::MIME->new($email);
 
# Debug MIME structure
#print STDERR "(debug) parsed structure:\n".$email_parsed->debug_structure;
 
# Remove attachments that do not match /$allowed_attachments/i
$email_parsed->walk_parts(sub {
	my ($part) = @_;
	return if $part->subparts; # multipart
 
	if ($part->content_type && $part->content_type !~ /$allowed_attachments/i)
	{
		# record that the email has been modified
		$mail_was_modified = 1;
		# replacement text
		$part->body_set($replacement_text . $part->content_type."\n");
		$part->encoding_set('8bit');
		$part->content_type_set('text/plain');
		$part->charset_set('us-ascii');
	}
});
 
# replace email with stripped email
if ($mail_was_modified)
{
	my @parts_new = $email_parsed->parts;
	$email_parsed->parts_set( \@parts_new );
	$email = $email_parsed->as_string;
};
 
# output the message
print $first_line if ($first_line =~ /^From .*\@/);
print $email;

Disclaimer

I do not know much Perl. Suggestions for improvements are welcome!
Also, I haven't tested this much. Use at your own risk!

Download

See also

Discussion

Andreas Schamanek, 2016-08-23 15:13

attchment_filter.pl has (probably) several shortcomings. One known bug is that if 1 part of a multipart/alternative is removed or changed, the multipart/alternative needs to be converted to e.g. multipart/mixed.


Andreas Schamanek, 2023-02-01 20:31

textmail ("mail filter to replace MS Word/HTML attachments with plain text", also at github) is a Perl script that does more in a better way.

attachment_filter_pl.txt · Last modified: 2023-02-01 20:34 by andreas

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki