Format a string in Elm

I have a list of strings and generate HTML from it dynamically, with one li tag per string. I want to use each string as the id attribute as well. The problem is that the strings contain special characters such as :, ', é, ... I want the output to include only digits (0-9) and letters (a-z).

// Input:
listStr = ["Pop & Suki", "PINK N' PROPER", "L'Oréal Paris"]

// Output: 
result = ["pop_suki", "pink_n_proper", "loreal_paris"] ("loral_paris" is also good)

Currently, I just lowercase the string and replace " " with _, but I don't know how to eliminate the special characters.

Many thanks!

2 answers

  • answered 2019-10-08 09:03 O.O.Balance

    Instead of thinking of it as eliminating special characters, consider the permitted characters – you want just lower-case alphanumeric characters.

    Elm provides Char.isAlphaNum to test for (ASCII) alphanumeric characters, and Char.toLower to transform a character to lower case. It also provides the higher-order function String.foldl, which you can use to process a String one Char at a time.

    So for each character:

    • check if it's alphanumeric
    • if it is, transform it to lower case
    • if not and it is a space, transform it to an underscore
    • else drop the character

    Putting this together, we create a function that processes a character and appends it to the string processed so far, then apply that to all characters in the input string:

    transformNextCharacter : Char -> String -> String
    transformNextCharacter nextCharacter partialString =
        if Char.isAlphaNum nextCharacter then
            partialString ++ String.fromChar (Char.toLower nextCharacter)
        else if nextCharacter == ' ' then
            partialString ++ "_"
        else
            partialString
    
    transformString : String -> String
    transformString inputString =
        String.foldl transformNextCharacter "" inputString
    

    Online demo here.

    Note: This answer simply drops special characters and thus produces "loral_paris" which is acceptable as per the OP.
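
    To tie this back to the question, here is a minimal sketch of how the result could be used as the id of each li. The helper name viewItem is hypothetical, and transformString is repeated (as an inline lambda) only so the sketch compiles on its own; it assumes the elm/html package.

    ```elm
    module Main exposing (main)

    import Html exposing (Html, li, text, ul)
    import Html.Attributes exposing (id)


    -- Same logic as transformString above, repeated so this sketch is self-contained.
    transformString : String -> String
    transformString inputString =
        String.foldl
            (\c acc ->
                if Char.isAlphaNum c then
                    acc ++ String.fromChar (Char.toLower c)

                else if c == ' ' then
                    acc ++ "_"

                else
                    acc
            )
            ""
            inputString


    -- viewItem is a hypothetical helper: the original string is the visible
    -- label, the transformed string is the id attribute.
    viewItem : String -> Html msg
    viewItem label =
        li [ id (transformString label) ] [ text label ]


    main : Html msg
    main =
        ul [] (List.map viewItem [ "Pop & Suki", "PINK N' PROPER", "L'Oréal Paris" ])
    ```

    Keep in mind that every space becomes an underscore, so "Pop & Suki" ends up as "pop__suki" with two underscores, because the spaces on both sides of the dropped & are kept.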

  • answered 2019-10-08 21:03 Kevin Ng

    The accepted answer is a lot more efficient than the code I have below. Still, I want to add my code as an alternative.

    If you want to convert accented characters to their plain equivalents, you can install and use the elm-community/string-extra package, which provides String.Extra.removeAccents.

    The code below is less efficient because it keeps calling library functions on the same string, and each of them goes through the string one char at a time.

    Also, note that when you remove the & from the first item, the two spaces around it turn into a double underscore. You have to replace the double underscore with a single underscore.

    import Html exposing (text)
    import String
    import List
    import String.Extra
    import Char
    
    listStr = ["Pop & Suki", "PINK N' PROPER", "L'Oréal Paris"]
    
    -- True if the character is a letter, a digit, or a space; otherwise False.
    isDigitAlphaSpace : Char -> Bool
    isDigitAlphaSpace c =
      Char.isAlpha c || Char.isDigit c || c == ' '
    
    main =
      List.map (\x -> String.Extra.removeAccents x     -- Remove accents first
                   |> String.filter isDigitAlphaSpace  -- Drop anything that is not a digit, a letter, or a space
                   |> String.replace " " "_"           -- Replace each space with _
                   |> String.replace "__" "_"          -- Replace each double __ with a single _
                   |> String.toLower) listStr          -- Lowercase the string
        |> Debug.toString
        |> Html.text
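
    One limitation of the pipeline above is that String.replace "__" "_" only collapses pairs, so a run of three or more underscores would still leave a double one. A possible variant, sketched here under the assumption that splitting on whitespace is acceptable, cleans each word separately and joins the non-empty ones. The name toSlug is hypothetical, and only elm/core functions are used, so accented characters are dropped rather than converted; String.Extra.removeAccents could be applied first, as in the code above.

    ```elm
    module Main exposing (main)

    import Html exposing (Html, text)


    -- toSlug is a hypothetical name for this sketch.
    toSlug : String -> String
    toSlug input =
        input
            -- Split on whitespace, so runs of spaces never produce empty gaps.
            |> String.words
            -- Keep only ASCII letters and digits inside each word.
            |> List.map (String.filter Char.isAlphaNum)
            -- A word such as "&" is now empty; drop it so no extra "_" appears.
            |> List.filter (not << String.isEmpty)
            |> String.join "_"
            |> String.toLower


    main : Html msg
    main =
        [ "Pop & Suki", "PINK N' PROPER", "L'Oréal Paris" ]
            |> List.map toSlug
            |> Debug.toString
            |> text
    ```

    With this approach "Pop & Suki" becomes "pop_suki" and "PINK N' PROPER" becomes "pink_n_proper" without a separate underscore clean-up step.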