As you said yourself, your goal is
to prevent malicious input.
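It is great that you already understand that you ALWAYS have to check user input on the back end. However, you do not need to write over 9000 validation rules. It has nothing to do with the language. You just use a technique called escaping a string: before the input is placed into another context (an SQL query, an HTML page), the characters that have a special meaning in that context are escaped so they are treated as plain data.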
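To make the escaping idea concrete, here is a minimal sketch in Python (my choice of language, the question does not name one). The `comments` table and the two helper functions are hypothetical. Note that for SQL the sketch uses parameter binding rather than manual escaping, which achieves the same goal of keeping the input as data.

```python
import html
import sqlite3


def render_comment(user_input: str) -> str:
    """Escape HTML-special characters so the input is shown as text, not markup."""
    return "<p>" + html.escape(user_input) + "</p>"


def save_comment(conn: sqlite3.Connection, user_input: str) -> None:
    """Bind the value as a parameter so it is never parsed as part of the SQL text."""
    conn.execute("INSERT INTO comments (body) VALUES (?)", (user_input,))


# Hypothetical in-memory table, used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (body TEXT)")

payload = "<script>alert(1)</script>'); DROP TABLE comments; --"
save_comment(conn, payload)

# The payload is stored verbatim and the table still exists.
stored = conn.execute("SELECT body FROM comments").fetchone()[0]
```

No validation rule had to know what an attack looks like; the input is stored as-is and only escaped when it is placed into HTML.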
The only exception is when you intentionally want to remove (strip) specific parts of user input, such as all HTML tags, emojis, or whatever else. That depends on your business case. If you really want to allow only a-z, 0-9 and Chinese characters, you can use regex ranges over Unicode code points.
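A whitelist like the one described above could be sketched as follows, again in Python. The range `\u4e00-\u9fff` is an assumption on my part: it covers the basic CJK Unified Ideographs block only, not the extension blocks, so adjust it to your business case. The function names are hypothetical.

```python
import re

# Assumed whitelist: lowercase a-z, digits, and the basic CJK block U+4E00..U+9FFF.
ALLOWED = re.compile(r"[a-z0-9\u4e00-\u9fff]+")
DISALLOWED = re.compile(r"[^a-z0-9\u4e00-\u9fff]")


def is_allowed(text: str) -> bool:
    """Return True only if every character of a non-empty string is whitelisted."""
    return ALLOWED.fullmatch(text) is not None


def strip_disallowed(text: str) -> str:
    """Remove every character that is not on the whitelist."""
    return DISALLOWED.sub("", text)
```

Whether you reject the input (`is_allowed`) or silently clean it (`strip_disallowed`) is a product decision; rejecting is usually less surprising for the user.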