I scan email attachments with an antivirus system. If an attachment is malicious or suspicious, it must be removed from the message, and a note saying that the attachment was deleted must be added. I ran into a problem: if the message contains an inline attachment (for example, a picture embedded in the body), appending the note at the end has no effect, because email clients simply do not display it when showing the message. I use the following message (the image contents have been removed):

    To: nmikaev <nmikaev@tip.avsw.ru>
    From: nmikaev <nmikaev@tip.avsw.ru>
    Subject: test18
    Message-ID: <5bf8356f-2cad-bb52-1644-344b25a6a3fe@tip.avsw.ru>
    Date: Wed, 13 Dec 2017 16:39:47 +0300
    User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0
    MIME-Version: 1.0
    Content-Type: multipart/alternative; boundary="------------9D100498B76DCCA2302346E9"
    Content-Language: en-US

    This is a multi-part message in MIME format.
    --------------9D100498B76DCCA2302346E9
    Content-Type: text/plain; charset=utf-8; format=flowed
    Content-Transfer-Encoding: 7bit

    test18

    test18

    --------------9D100498B76DCCA2302346E9
    Content-Type: multipart/related; boundary="------------F4FDF24DEF3F140B04A4F455"

    --------------F4FDF24DEF3F140B04A4F455
    Content-Type: text/html; charset=utf-8
    Content-Transfer-Encoding: 7bit

    <html>
      <head>
        <meta http-equiv="content-type" content="text/html; charset=utf-8">
      </head>
      <body text="#000000" bgcolor="#FFFFFF">
        <p>test18</p>
        <p><img src="cid:part1.FFAD31E9.10EFE63D@tip.avsw.ru" alt=""></p>
        <p>test18</p>
      </body>
    </html>

    --------------F4FDF24DEF3F140B04A4F455
    Content-Type: image/jpeg; name="oplpbkaaobmmkala.jpeg"
    Content-Transfer-Encoding: base64
    Content-ID: <part1.FFAD31E9.10EFE63D@tip.avsw.ru>
    Content-Disposition: inline; filename="oplpbkaaobmmkala.jpeg"

    HERE IMAGE CONTENT
    --------------F4FDF24DEF3F140B04A4F455--

    --------------9D100498B76DCCA2302346E9
    Content-Type: text/plain; charset="us-ascii"
    MIME-Version: 1.0
    Content-Transfer-Encoding: 7bit

    Removed oplpbkaaobmmkala.jpeg with hash: 4b002716ede36b1f8da0ec3543cdb6996b50d943d1ba789a8fd39e3d467174e3

    --------------9D100498B76DCCA2302346E9--

The note has already been appended at the end of this message. I use the following code:

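    from email.mime.text import MIMEText

    # Build one "Removed ..." line per deleted attachment and attach the note
    text = ''
    if self.attachments_pool:
        for attachment in self.attachments_pool:
            text += 'Removed ' + attachment.getFilename() + ' with hash: ' + attachment.getHash() + '\n'
        text = MIMEText(text, 'plain')
        self.msg.attach(text)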

If the message contains no inline attachments, however, the note is displayed correctly. Perhaps the problem is that test18 appears twice, once in text/plain and once in text/html, with the image between the two paragraphs in the HTML version. Maybe I need to select the main part, that is, the email.Message object to which the new text should be added. How do I do that? And am I right that the attach method is not really suitable for this purpose?

One more question: how can a part with Content-Disposition: inline or Content-Disposition: attachment be removed efficiently? Do I have to build a new message (a new email.Message object) and add the remaining attachments to it? Or can the existing email.Message object be modified somehow? And if it cannot, how do I build a new message when the original can be of arbitrary complexity (all headers and all parts would have to be copied, for example using msg.walk)?

Thank you very much in advance! I hope to get an answer to my question.

  • one
    You have added one more alternative to the multipart/alternative message. Of course, there is no guarantee that the client will show your alternative in particular. In general, you need to build a new message in which the part that held the removed virus is replaced with a text part containing the diagnostic. Additionally, all of this can be wrapped in a new multipart/mixed message: the first part is text with a notice about the deleted parts, and the second part, of type message/rfc822, carries the original message with the corrected parts as an attachment. - Outtruder
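
The wrapping suggested in the comment above could be sketched roughly as follows. This is only an illustration, assuming Python 3.6+ and the EmailMessage API; wrap_with_notice, the example addresses, and the notice text are hypothetical and do not come from the original post.

```python
from email import policy
from email.message import EmailMessage


def wrap_with_notice(original, notice):
    """Return a new multipart/mixed message: notice first, original attached.

    Hypothetical helper, not part of the original post.
    """
    wrapper = EmailMessage(policy=policy.default)
    # Copy the headers the recipient's client needs to list the message.
    for name in ("From", "To", "Cc", "Subject", "Date"):
        if original[name] is not None:
            wrapper[name] = str(original[name])
    wrapper.set_content(notice)       # first part: text/plain notification
    wrapper.add_attachment(original)  # second part: message/rfc822
    return wrapper


# Small demonstration with made-up data.
original = EmailMessage()
original["From"] = "sender@example.com"
original["To"] = "rcpt@example.com"
original["Subject"] = "test18"
original.set_content("test18")

wrapped = wrap_with_notice(original, "Removed oplpbkaaobmmkala.jpeg")
```

Because the notice is the only body of the outer message, the client has no other alternative to pick, and the cleaned original stays available as an attached message.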

1 answer

If simply replacing the attachment with text does not work in your case, you can replace the picture without tracking where its cid is used (for pictures embedded in the message): generate a picture that contains the desired text and put it in the same place:

    import email.policy
    from email.mime.image import MIMEImage

    for part in msg.walk():  # msg is EmailMessage
        if part.get_content_maintype() == 'image':
            # generate replacement image
            text = "name: {}, sha1: {}".format(
                part.get_filename() or 'unknown', sha1(part.get_content()))
            subtype = part.get_content_subtype()
            mime_image = MIMEImage(generate_image_data(text, subtype), subtype,
                                   policy=email.policy.default)
            # replace
            part.set_payload(mime_image.get_payload())
            del part['Content-Type']  # NOTE: part.clear_content() removes too much
            del part['Content-Transfer-Encoding']
            del part['MIME-Version']
            for k, v in mime_image.items():  # copy new Content-Type, etc headers
                part[k] = v

The code finds the parts of the message to replace (here, all pictures). For each such part it generates a picture whose text contains the file name and the sha1 hash, then puts that picture in place of the original by replacing the payload and copying the headers (so that Content-Type and Content-Transfer-Encoding are correct).

Where:

    import hashlib

    def sha1(data):
        return hashlib.sha1(data).hexdigest()

and:

    import io
    from PIL import Image, ImageDraw, ImageFont  # $ pip install pillow

    def generate_image_data(text, type='png'):
        font = ImageFont.load_default()
        # font.getsize() was removed in Pillow 10; getbbox() returns
        # (left, top, right, bottom) for the rendered text
        _, _, width, height = font.getbbox(text)
        image = Image.new("RGB", (width, height))
        ImageDraw.Draw(image).text((0, 0), text, font=font)
        buffer = io.BytesIO()
        image.save(buffer, type)
        return buffer.getvalue()
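
If a picture is not required, another option is to modify the existing message in place: drop the offending parts from their containers and append the note to the bodies that clients actually display, instead of attaching it as one more alternative. The following is a minimal sketch under the assumption that the message is an EmailMessage (for example, parsed with policy=email.policy.default); remove_parts and append_notice are hypothetical helper names, and the sample message is made up.

```python
import html
from email.message import EmailMessage


def remove_parts(msg, is_bad):
    """Remove every non-multipart part for which is_bad(part) is true.

    Returns the removed parts so the caller can report them.
    """
    removed = []
    # Collect the containers first so the tree is not modified mid-walk.
    containers = [p for p in msg.walk()
                  if p.get_content_maintype() == "multipart" and p.is_multipart()]
    for container in containers:
        keep = []
        for child in container.get_payload():
            if not child.is_multipart() and is_bad(child):
                removed.append(child)
            else:
                keep.append(child)
        container.set_payload(keep)
    return removed


def append_notice(msg, notice):
    """Append notice to the text/plain and text/html bodies, where present."""
    plain = msg.get_body(preferencelist=("plain",))
    if plain is not None:
        plain.set_content(plain.get_content().rstrip("\n") + "\n\n" + notice + "\n")
    html_part = msg.get_body(preferencelist=("html",))
    if html_part is not None:
        body = html_part.get_content()
        block = "<p>" + html.escape(notice).replace("\n", "<br>") + "</p>"
        pos = body.lower().rfind("</body>")
        # Insert before </body> when it exists, otherwise append at the end.
        body = body[:pos] + block + body[pos:] if pos != -1 else body + block
        html_part.set_content(body, subtype="html")


# Small demonstration with made-up data.
msg = EmailMessage()
msg.set_content("test18")
msg.add_alternative("<html><body><p>test18</p></body></html>", subtype="html")
msg.add_attachment(b"\xff\xd8fake", maintype="image", subtype="jpeg",
                   filename="a.jpeg")

removed = remove_parts(msg, lambda p: p.get_content_maintype() == "image")
for part in removed:
    append_notice(msg, "Removed {}".format(part.get_filename() or "unknown"))
```

Note that an HTML body may still reference the removed part through its cid, in which case the client will probably show a broken-image placeholder next to the note; rewriting the body in append_notice also resets parameters such as format=flowed.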