LET'S LEARN ABOUT ML RISKS
We have developed our own machine learning models to detect risk more accurately in your cloud environments. They are available to all customers and can be enabled by reaching out to support@managedmethods.com.
Self Harm ML:
The Self Harm ML risk is a continually improving model that flags content indicating self-harm. We trained it on large data sets from a variety of sources, and we continually retrain it using both new material we have found and content our customers submit through the "False Positive" reporting feature within the platform. This lets us keep honing the accuracy of what is considered a risk and reduce the overall number of false positives. See THIS guide page for details on submitting false positives.
Image Risk ML:
Toxicity ML:
- Identity Attacks: Attacks targeting race, sexuality, gender, or other aspects of identity.
- Insults: Content directing an insult at an individual.
- Obscene: Content containing sexually explicit or vulgar language, including profanity.
- Threats: Content containing threatening language.
See THIS guide page for details on submitting false positives.