I have a problem decoding an email header. When the encoding is quoted-printable, the encoded characters come out as question marks. For example, =D1=80=D0=B5=D1=82=D0=B2=D0=B8=D1=82=D0=BD=D1=83=D0=BB(=D0=B0) is decoded to ??????????????????(??) instead of ретвитнул(а). The Base64 branch of the decoder displays all characters correctly.

    public static string DecodeEncodedLine(string text)
    {
        Regex regex = new Regex(@"\s*=\?(?<charset>.*?)\?(?<encoding>[qQbB])\?(?<value>.*?)\?=");
        string encoded = text;
        string decoded = string.Empty;
        while (encoded.Length > 0)
        {
            Match match = regex.Match(encoded);
            if (match.Success)
            {
                decoded += encoded.Substring(0, match.Index);
                string charset = match.Groups["charset"].Value;
                string encoding = match.Groups["encoding"].Value.ToUpper();
                string value = match.Groups["value"].Value;
                if (encoding.Equals("B"))
                {
                    var bytes = Convert.FromBase64String(value);
                    decoded += Encoding.GetEncoding(charset).GetString(bytes);
                }
                else if (encoding.Equals("Q"))
                {
                    Regex reg = new Regex(@"(\=([0-9A-F][0-9A-F]))", RegexOptions.IgnoreCase);
                    decoded += reg.Replace(value, new MatchEvaluator(m =>
                    {
                        byte[] bytes = new byte[m.Value.Length / 3];
                        for (int i = 0; i < bytes.Length; i++)
                        {
                            string hex = m.Value.Substring(i * 3 + 1, 2);
                            int iHex = Convert.ToInt32(hex, 16);
                            bytes[i] = Convert.ToByte(iHex);
                        }
                        return Encoding.GetEncoding(charset).GetString(bytes);
                    })).Replace('_', ' ');
                }
                else
                {
                    decoded += encoded;
                    break;
                }
                encoded = encoded.Substring(match.Index + match.Length);
            }
            else
            {
                decoded += encoded;
                break;
            }
        }
        return decoded;
    }

I also tried Attachment attachment = Attachment.CreateAttachmentFromString("", stringAttached); and the result is the same.

  • You have the wrong encoding. It looks like it should be UTF-8, but you are decoding it as ASCII - Zergatul
  • And how do I fix it? - VUser
  • Encoding.GetEncoding(charset), but I do not know where the charset should come from if it is not explicitly specified in the line - Zergatul
  • The line has the following format: =?UTF-8?Q?=D1=80=D0=B5 - VUser
  • Check in the debugger which encoding is actually created there, or should I do it for you? - Zergatul

1 answer

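You decode the bytes one at a time, but the bytes of a multi-byte character must be decoded together. In UTF-8 each Cyrillic letter here takes two bytes (for example, =D1=80 is р), and your regex matches a single =XX per call, so every byte is passed to GetString on its own as an invalid sequence. Match the whole run of consecutive =XX escapes and decode it as one byte array. Here is the corrected code for quoted-printable: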

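    // Requires: using System.Linq;
    // The trailing + makes one match cover a whole run of =XX escapes,
    // and the named group "byte" records every hex pair in that run.
    Regex reg = new Regex(@"(\=(?<byte>[0-9A-F][0-9A-F]))+", RegexOptions.IgnoreCase);
    decoded += reg.Replace(value, new MatchEvaluator(m =>
    {
        // Collect all captured hex pairs of the run into a single byte array.
        byte[] bytes = m.Groups["byte"].Captures
            .Cast<Capture>()
            .Select(c => (byte)Convert.ToInt32(c.Value, 16))
            .ToArray();
        return Encoding.GetEncoding(charset).GetString(bytes);
    })).Replace('_', ' ');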
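To try the grouped decoding in isolation, here is a minimal self-contained sketch. The class name QuotedPrintableDemo and the helper DecodeQ are hypothetical names introduced only for this illustration; the regex idea is the same as above. One deliberate difference, which is my assumption about the desired behavior: underscores are replaced with spaces before the =XX runs are decoded, so that a literal underscore written as =5F is not turned into a space afterwards.

```csharp
using System;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

public static class QuotedPrintableDemo
{
    // Hypothetical helper: decodes the payload of a Q-encoded word,
    // i.e. the text between "?Q?" and "?=" in "=?charset?Q?...?=".
    public static string DecodeQ(string value, string charset)
    {
        Encoding encoding = Encoding.GetEncoding(charset);

        // In Q-encoding an underscore stands for a space. Doing this first
        // keeps a decoded "=5F" (a real underscore) intact.
        string input = value.Replace('_', ' ');

        // One match covers a whole run of consecutive =XX escapes.
        var run = new Regex(@"(=(?<byte>[0-9A-F]{2}))+", RegexOptions.IgnoreCase);

        return run.Replace(input, m =>
        {
            // All hex pairs of the run become one byte array, so multi-byte
            // characters are decoded from their complete byte sequence.
            byte[] bytes = m.Groups["byte"].Captures
                .Cast<Capture>()
                .Select(c => Convert.ToByte(c.Value, 16))
                .ToArray();
            return encoding.GetString(bytes);
        });
    }

    public static void Main()
    {
        string value = "=D1=80=D0=B5=D1=82=D0=B2=D0=B8=D1=82=D0=BD=D1=83=D0=BB(=D0=B0)";

        // Ask the console to emit UTF-8 so the Cyrillic text can be displayed.
        Console.OutputEncoding = Encoding.UTF8;

        // DecodeQ returns "ретвитнул(а)" for this input.
        Console.WriteLine(DecodeQ(value, "UTF-8"));
    }
}
```

The parentheses in the sample are plain ASCII, so they split the input into two runs of escapes; each run is decoded separately and the literal characters between them are left untouched.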