News

Posted about 8 years ago
In my last post, I covered the speech rate problem as I perceived it. Understanding the theory of what is going on is half the work. I decided to make a cross-platform speech rate benchmark that would allow me to identify how well each platform conforms to rate changes.

So how do you measure speech rate? I guess you could count words per minute, but in reality all we want is relative speech rate. The duration of each utterance at given rates should give us a good reference for the relative rate. Is it perfect? Probably not. Each speech engine may treat the time between words and punctuation pauses differently at different rates, but it will give us a good picture. For example, if it takes a speech service α seconds to speak an utterance, it should take it α/2 seconds to say the same thing at a rate of 2.0, or 2α seconds at a rate of 0.5. If we want to measure an engine's rate compliance across a set of different rates, we first get the utterance time at a "normal" rate of 1.0, and then the rest is simple division. I wrote a tool to do just this. Play around with it. It is fun.

Speech Rates in OSX

As I mentioned in the previous post, it looks like our OSX speech support is the only platform where we actually got it right. Running the rates benchmark on the Alex voice gives us these results:

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
Alex      0.415  0.459   0.521  0.593   0.688  0.826   1.00  1.27  1.52  1.75  1.99  2.23  2.48

You know what would be prettier than a table? A graph! So I made the benchmark generate a graph that gives a good visualization of the rate distribution. I normalized it with a log10 so that you can get a better idea of distances in the graph. Here is what it looks like for a rate-conforming voice like Alex:

[Figure: Rate distribution of the Alex voice on OSX]

OSX voices aren't perfect. For example, the Hebrew Carmit voice does not support rates under 0.5 (or above 3, but that is not shown here). If a rate outside the supported range is given, the voice will just go back to its default. That's not too good! There may be some workarounds to make sure we at least use the max/min rate, but I didn't check it out yet.

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
Carmit    0.926  0.927   0.520  0.584   0.686  0.818   1.00  1.27  1.54  1.75  2.04  2.30  2.53

[Figure: Carmit rate distribution]

It could even be worse. OSX has a voice named "Bad News". Apparently, there is only one rate in which to deliver bad news:

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
Bad News  0.882  0.899   0.917  0.937   0.956  0.978   1.00  1.02  1.03  1.04  1.05  1.05  1.06

[Figure: Bad News rate distribution]

With a few exceptions, speech rate on OSX is pretty good. Their API and documentation are straightforward enough that we nailed this on the first shot.

Speech Rates in Windows

As mentioned in the previous post, speech rate in SAPI is defined as an integer from -10 to 10. In our initial implementation, we naively assumed we could project the Web Speech API rate multiplier across SAPI's full range, treating each SAPI step as a factor of the tenth root of ten: 0.1 would become -10, 1 would become 0, and 10 would become 10. With my fancy new benchmark tool, we could get a good picture of where that leads us with the standard David voice:

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25   1.5   1.75  2     2.25  2.5
MS David  0.712  0.713   0.712  0.799   0.892  0.976   1.00  0.977  1.13  1.25  1.41  1.40  1.40

That's not too good.
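To make the two mappings concrete, here is a rough sketch of the naive projection alongside the corrected one described below. Both functions are reconstructed from the prose of this post rather than taken from Gecko's SAPI backend, and the clamping is an assumption:

```js
// Naive projection: spread the Web Speech API range (0.1 .. 10) over SAPI's
// full -10 .. 10 scale, i.e. treat each SAPI step as a factor of the 10th root of 10.
// 0.1 -> -10, 1 -> 0, 10 -> 10.
function naiveSapiRate(rate) {
  return Math.round(10 * Math.log10(rate));
}

// Corrected projection, per the MSDN note that SAPI 10 means three times the
// normal speed and -10 means one third: each SAPI step is a factor of the
// 10th root of 3, so 1/3 -> -10, 1 -> 0, 3 -> 10. Clamping is an assumption.
function fixedSapiRate(rate) {
  const sapi = Math.round(10 * (Math.log(rate) / Math.log(3)));
  return Math.max(-10, Math.min(10, sapi));
}

naiveSapiRate(2.0); // 3: the engine barely speeds up
fixedSapiRate(2.0); // 6: much closer to double speed
```

The same idea with different constants is what eventually fixed the Speech Dispatcher conversion described further down: once you know what the ends of the integer scale actually mean, the projection is just a change of logarithm base.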
With the naive mapping, if you provide a rate of 2.0 you would expect double the normal rate, but you only get 1.41x. Here is a graph:

[Figure: Broken rate distribution on Windows]

I also mentioned in the previous post that after digging in MSDN, I found a page that explains what the expected speech engine characteristics are. In essence, 10 in SAPI means three times the normal speed, and -10 is one third. That is something to work with! I fixed the math we do for the rate conversion, and once we correctly projected our multiplier values into what SAPI accepts, we got much better benchmark results:

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
MS David  0.413  0.454   0.509  0.568   0.713  0.799   1.00  1.25  1.40  1.76  1.97  2.19  2.41

[Figure: Fixed Windows rate distribution]

Speech Rates in Linux

In Linux, we originally had the same naive conversion that we had in Windows, with the only difference being that in Speech Dispatcher the rate is an integer from -100 to 100. The results were just as bad:

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
default   0.659  0.684   0.712  0.757   0.804  0.887   1.00  1.01  1.07  1.07  1.14  1.15  1.20

[Figure: Broken rate distribution on Linux]

Speech Dispatcher has limited documentation. But luckily, it is open source, so we just need to read the code, right? I guess. Theoretically. But in this case, it just made me go down a rabbit hole. Speech Dispatcher supports a few engines, but in reality most distributions make it hard to use anything that isn't eSpeak. I looked at the eSpeak speechd configuration file, and it had some interesting numbers: the "normal" rate is 170 eSpeak words per minute, the max is 390, and the min is 80. In a proportional sense, that would be a minimum rate of 0.47 and a maximum of 2.3. Adjusting the math in our Speech Dispatcher adapter to those theoretical values made things much better, but not perfect. Reading the code of speechd and its eSpeak output module didn't reveal anything else. Instead of relying on bad documentation and fallible "source code", I decided to just go with the numbers I was seeing in the benchmark: Speech Dispatcher rates max out at 2.5x and won't go below 1/2x. I put those numbers into our conversion formula, and… it worked well!

Voice     1/2.5  1/2.25  1/2    1/1.75  1/1.5  1/1.25  1     1.25  1.5   1.75  2     2.25  2.5
default   0.488  0.487   0.487  0.567   0.678  0.812   1.00  1.29  1.47  1.76  1.96  2.14  2.39

[Figure: Fixed rate distribution on Linux]

Conclusion

What's the lesson to be learned? Not sure. Sometimes the docs are good, other times less so, and sometimes not at all? Benchmarks and real-world results are important? The one certain thing I know now is that graphs are cool. This is what the rate distribution was before I fixed these issues; check it out:

[Figure: Rate distribution on all three platforms before the fix]

And look how clean and nice it is now:

[Figure: Fixed rate distribution on all three platforms]
Posted about 8 years ago
While working on the speech synthesis API in Firefox, I have been trying to figure out how to provide the most consistent experience across different desktop platforms. This is tricky, because each platform has its own speech API, and each API has slightly differing feature sets and idiosyncrasies. A good way to foresee the difficulties others will encounter when writing a cross-platform speech app is to actually write one. So when we started work on the Narrate feature, I needed to account for the fact that the Linux speech API is not capable of pausing speech mid-utterance. We managed to design the interface in a way that wouldn't require pausing speech, but would still give the user an intuitive way of stopping and starting the narration midway through the article.

When we landed Narrate, Ehsan noticed that the speech rate slider in the interface was useless at either extreme: either silly fast or glacially slow. Since I was developing this feature on Linux, I didn't encounter the fast rates OSX users experienced. I first thought this was a bug in our OSX speech synthesis support, but later realized Mac was the exception, and this was an issue with the rate conversion in Linux and Windows.

Speech rate is subjective, in so many ways. What is a "normal" speech rate? I have found many numbers, ranging from 150 to 200 words per minute for US English speakers. Luckily, when it comes to the web API we don't care what the "normal" rate is; we care about relative rate. But that doesn't completely solve our conundrum. The Web Speech API (and SSML) defines the rate as a multiplier of the normal speech rate for a given voice: 1 is normal, 2 is twice the speed, and 0.5 is half (a minimal code example is sketched at the end of this post). The speech rate then goes through a few conversions:

1. The web page provides a rate to the browser.
2. The browser converts the rate to the platform-specific speech API.
3. The speech API converts the rate to the speech engine API.
4. In some cases, the voice converts the rate value the engine provides to it.

In OSX, life of course is easiest. The rate is defined in words per minute, and the docs say "normal" is between 180 and 220. All we need to do is multiply that, and we get a predictable rate. Things get a little hairier on Linux and Windows. In Linux, the Speech Dispatcher API docs say that the rate should be an integer between -100 and 100, with 0 being the default. In Windows, the situation is not much better: a cursory glance at the docs for ISpVoice::SetRate just specifies that accepted rates are between -10 and 10.

Further digging brought up more information on Linux and Windows that made the rate parameter a bit more understandable. In Linux, Speech Dispatcher's eSpeak output module configuration file shows that the default rate is 160 wpm, the minimum is 80, and the max is 320. That's pretty conservative, but at least it gives us an idea of what is going on. Deeply buried in the Windows docs, I found a mention that the max is 3x the normal rate, and the minimum is a third of it. Now we have something to work with!

Did I bore you? I am boring myself… Next post: a speech rate benchmark tool, pretty graphs, and my attempt at "math"!
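As promised above, here is the minimal example of how a web page asks for a speech rate. It uses only the standard Web Speech API, nothing Firefox-specific; everything after this call (the platform API conversion and the engine's behavior) is out of the page's hands:

```js
// Request an utterance at twice the voice's normal rate.
// The browser must translate this multiplier into whatever the platform
// speech API expects (words per minute, -10..10, -100..100, and so on).
const utterance = new SpeechSynthesisUtterance("Twice as fast as normal.");
utterance.rate = 2.0; // 1 is normal, 2 is double, 0.5 is half
speechSynthesis.speak(utterance);
```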
Posted about 8 years ago
Reader View in Firefox makes reading articles, stories, and blog posts enjoyable. It removes the noisy background ads and graphics, and gives you a clean single column optimized for your reading pleasure. As of today's Nightly build, you will find an extra button in the Reader View toolbar: the Narrate button. Press play in the popup, and you will have the page read aloud. You are now free to give your eyes a rest, knit, wash dishes, work out, play Candy Crush, whatever.

At Mozilla, we believe the web must remain open and accessible. Accessibility can mean many things. In our accessibility team, we work to make Firefox usable for users with disabilities. But disability is not binary; it is more nuanced than that. We define our users broadly and don't divide them into users with and without disabilities. There can be many reasons why you would choose to click play on that Narrate popup: eye fatigue, multitasking, dyslexia, or Angry Birds. With features like Narrate, we want to make the web more accessible and convenient for everybody.
Posted about 8 years ago
For the last three years I have had the opportunity to send out a reminder to Mozilla staff that Martin Luther King Jr. Day is coming up, and that U.S. employees get the day off. It has turned into my MLK Day eve ritual. I read his letters, listen to speeches, and then I compose a belabored paragraph about Dr. King with some choice quotes.

If you didn't get a chance to celebrate Dr. King's legacy and the movements he was a part of, you still have a chance:

- Watch Selma.
- Watch The Black Power Mixtape (it's on Netflix).
- Read A Letter from a Birmingham Jail (it's really, really good).
- Listen to his speech Beyond Vietnam.
- Listen to his last speech, I Have Been To The Mountaintop.
Posted over 8 years ago
When we first made Firefox accessible for Android, the majority of our users were using Gingerbread. Accessibility in Android was in its infancy, and certain things we take for granted in mobile screen readers were not possible. For example, swiping and explore by touch were not available in TalkBack, the Android screen reader. The primary mode of interaction was with the directional keys of the phone, and the only accessibility event was "focus". The bundled Android apps were hardly accessible, so it was a challenge to make a mainstream, full-featured web browser accessible to blind users.

Firefox for Android at the time was undergoing a major overhaul, so it was a good time to put some work into our own screen reader support. We were governed by two principles when we designed our Android accessibility solution: integrate with the platform, even if it is imperfect, and don't require the user to learn anything new. The screen reader was less than intuitive; if the user jumped through the hoops to learn how to use the directional pad, they endured enough. Don't force them through additional steps. Don't make them install addons or change additional settings. Instead, introduce new interaction modes through progressive enhancement: as long as the user could use the d-pad, we could introduce other features that might make users even happier. There are a number of examples of where we did this:

- As early as our Gingerbread support, we had keyboard quick navigation support. Did your phone have a keyboard? If so, you could press "h" to jump to the next heading instead of arrowing all the way down.
- When Ice Cream Sandwich introduced explore by touch, we added swipe left/right to get to the previous or next item. We also added 3-finger swipe gestures to do quick navigation between element types. This feature got mixed feedback: it is hard to swipe with 3 fingers horizontally on a 3.5″ phone.

It was a real source of pride to have the most accessible and blind-friendly browser on Android. Since our initial Android support, our focus has gone elsewhere while we continued to maintain our offering. In the meantime, Google has upped its game. Android has gotten a lot more sophisticated on the accessibility front, and Chrome integrated much better with TalkBack (I would like to believe we inspired them). Now that Android has good web accessibility support, it is time that we integrate with those features to offer our users the seamless experience they expect. In tomorrow's Nightly, you will see a few improvements:

- TalkBack users can pinch-zoom the web content with three fingers, just like they can in Chrome (bug 1019432).
- The TalkBack local context menu has more options that users expect with web content, like section, list, and control navigation modes (bug 1182222). I am proud of our section quick nav mode; I think it will prove to be pretty useful.
- We integrate much better with the system cursor highlight rectangle (bug 1182214).
- The TalkBack scroll gesture works as expected. Also, range widgets can be adjusted with the same gesture (bug 1182208).
- Improved BrailleBack support (bug 1203697).

That's it, for now! We hope to prove our commitment to accessibility as Firefox branches out to other platforms. Expect more to come.
Posted almost 9 years ago
Every committed Mozillian and many enthusiastic end users will use a pre-release version of Firefox. On Mac and Windows this is pretty straightforward: you simply download the Firefox Nightly/Aurora/Beta dmg or setup tool and get going. Once it is installed it is a proper desktop application; you can make it your default browser, and life goes on. On Linux, we rely much more on packagers to prepare an application for the distribution before we can use it. This usually works really well, but sometimes you really just want to use an upstream app without any gatekeepers.

The pre-release versions of Firefox for Linux come in tarballs. You unpack them and can run them out of the unpacked directory, but it doesn't run well: you can't set it as your default browser, the icon is a generic square, and opening links from other apps is a headache. In short, it's a less than polished experience.

So here is a small script I wrote. It does a few things:

- It downloads the latest Firefox from the channel of your choosing.
- It unpacks it into a hidden directory in your $HOME.
- It adds a symbolic link to the main executable in ~/.local/bin.
- It adds symbolic links for the icon's various sizes into your icon theme in ~/.local/share/icons.
- It adds a desktop file to ~/.local/share/applications.

It doesn't require root privileges and is contained to your home directory, so it won't conflict with the system Firefox installation or touch the system libxul. Typically, you only need to run the script once per channel. After a channel is installed, it will get automatic updates through the actual app. See the nice icon?

So, here are some commands you could copy to your terminal and have pre-release Firefox installed:

Nightly: curl https://raw.githubusercontent.com/eeejay/foxlocal/master/foxlocal.py | python - nightly
Aurora:  curl https://raw.githubusercontent.com/eeejay/foxlocal/master/foxlocal.py | python - aurora
Beta:    curl https://raw.githubusercontent.com/eeejay/foxlocal/master/foxlocal.py | python - beta
Release: curl https://raw.githubusercontent.com/eeejay/foxlocal/master/foxlocal.py | python - release
Posted almost 9 years ago
Now that eSpeak runs pretty well in JS, it is time for a Web Speech API extension! What is the Web Speech API? It gives any website access to speech synthesis (and recognition) functionality; Chrome and Safari already have this built in. This extension adds speech synthesis support to Firefox, and adds eSpeak voices.

For the record, we have had speech synthesis support in Gecko for about two years. It was introduced for accessibility needs in Firefox OS; now it is time to make sure it is supported on desktop as well. Why an extension instead of built-in support? A few reasons:

- An addon will provide speech synthesis to Firefox now, as we implement built-in platform-specific solutions for future releases.
- An addon will allow us to surface current bugs, both in our Speech API implementation and in the spec.
- We designed our speech synthesis implementation to be extensible with addons; this is a good proof of concept.
- People are passionate about eSpeak. Some people love it, some people don't.

So now I will shut up and let eSpeak do the talking:

- Download the latest version.
- Check out the source.
- Visit this speech synthesis demo page.
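From a web page's point of view, the extension's voices are just more entries in the standard voice list, so the usual Web Speech API calls apply. A small sketch; the "eSpeak" name match is an assumption about how the extension labels its voices:

```js
// Pick an eSpeak voice if the extension has registered one, and speak with it.
function speakWithESpeak(text) {
  const voices = speechSynthesis.getVoices(); // may be empty until voices have loaded
  const voice = voices.find((v) => /espeak/i.test(v.name)); // naming is an assumption
  const utterance = new SpeechSynthesisUtterance(text);
  if (voice) {
    utterance.voice = voice;
  }
  speechSynthesis.speak(utterance);
}

speakWithESpeak("Hello from the Web Speech API!");
```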
Posted about 9 years ago
tl;dr: Look! A flashy demo with buttons!

Background

A long time ago, we were investigating a way to expose text-to-speech functionality on the web. This was long before the Web Speech API was drafted, and it wasn't yet clear what this kind of feature would look like. Alon Zakai stepped up and proposed porting eSpeak to JavaScript with Emscripten. This was a provocative idea: was our platform powerful enough to support speech synthesis purely in JS? Alon got back a few days later with a working demo; the answer was "yes".

While the speak.js port was very impressive, it didn't answer many of our practical needs. For example, the latency was not good enough for a responsive UI; you could wait more than a couple of seconds to hear a short phrase. In addition, the longer the text you wanted to synthesize, the longer you needed to wait. It proved a concept, but there were missing pieces we didn't have four years ago. Today, we live in the future of 2011, and things that were theoretical then are possible now (in the future).

asm.js

Today, Emscripten will compile C/C++ code into a subset of JavaScript called asm.js. This subset is optimized on all current browsers and allows performance within about 2x of native. That is really good. eSpeak is a pretty lightweight library already; the extra performance boost of asm.js makes speech instantaneous.

Transferable Objects

Passing data between a web worker and a parent process used to mean a lot of copying, since the worker doesn't share memory with the parent process. But today, you can transfer ownership of ArrayBuffers with zero copying. When the web worker is ready to send audio data back to the calling process, it can do so while maintaining a single copy of the audio buffer.

Web Audio API

We have a slick, full-featured audio API today on the web. When speak.js came out in 2011, it used a prefixed method on an audio element to write PCM data to. Today, we have a proper API that enables us to take the audio data and send it through an elaborate pipeline of filters and mixers, or even send it into the ether with WebRTC.

Emscripten Got Fancy

This was my first time playing with it, so I am not sure what was available in 2011. But if I have to guess, it was not as powerful and fun to work with. Emscripten's new WebIDL support makes adding bindings extremely easy. You still get a chance to do some pointer arithmetic, but that's supposed to be fun. Right?

So here is eSpeak.js! I wanted to do a real API port, as opposed to simply porting a command line program that takes input and writes a WAV file. Why? Two main reasons:

- eSpeak can progressively synthesize speech. If you provide a callback to espeak_Synth(), it will be called repeatedly with as many samples as you defined in the buffer size. It doesn't matter how long the text you want synthesized is; it will fill the buffer and return it to you immediately. This allows for consistently low latency from the moment you call espeak_Synth() until you can start playing audio.
- eSpeak supports events. If you use a callback, you get access to a list of events that provide a timestamp in the audio and the type of event that occurs there, such as word or sentence boundaries.

And, of course, with all the recent-ish platform improvements above, it was really time for a fresh attempt.
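Put together, the path from the synthesis worker to the speakers looks roughly like this. It is a sketch of the general pattern rather than eSpeak.js's actual message format; the worker file name, the message shape, and the 22050 Hz sample rate are all assumptions:

```js
// Page side: receive synthesized samples from the worker and play them.
const worker = new Worker("espeak-worker.js"); // hypothetical file name
const ctx = new AudioContext();

worker.onmessage = (event) => {
  const samples = event.data.samples; // Float32Array whose buffer was transferred, not copied
  const audioBuffer = ctx.createBuffer(1, samples.length, 22050); // mono, assumed sample rate
  audioBuffer.getChannelData(0).set(samples);
  const source = ctx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(ctx.destination);
  source.start();
};

// Worker side (in espeak-worker.js): hand each chunk back with zero copying
// by listing the underlying ArrayBuffer as a transferable.
//   const samples = new Float32Array(chunkLength); // filled by the synth callback
//   postMessage({ samples }, [samples.buffer]);
```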
Future Work

- Break up the data files. Right now, eSpeak.js is over a 2MB download. That's because I packaged all the eSpeak data files indiscriminately. There may be a few bits that are redundant. On the flip side, you get all 99 voice/language combinations (that's a good deal for 2MB, eh?). It would be cool to break it up into a few data files and allow the developer to choose which voices to bundle or, even better, just grab them on demand.
- Make a demo of the speech events. It makes my head hurt to think about how to do something compelling, but it is a neat feature that should somehow be shown.
- ScriptProcessorNode is apparently deprecated. This is going to need to be ported to an AudioWorker once that is widely implemented.

I'm done apologizing. Here is the demo.
Posted over 9 years ago
The Internet is a global public resource that must remain open and accessible. — Mozilla manifesto

Mozilla invests in accessibility because it's the right thing to do. We have staff, a team of engineers, who exclusively focus on accessibility in our products and play a positive role in the general accessibility of the web. This has paid off well: Firefox is well regarded as a leader in screen reader support on the desktop and on Android. We have the best HTML5 accessibility support in our browser, and we are close to having a fully functional screen reader in Firefox OS. I say "close" because we are not yet there.

Most websites are fairly accessible with little to no effort from the site developers. The document model of the web is relatively simple and malleable enough that blind users are able to access them through screen readers. Advanced web applications are a whole other story; developers are required to be much more mindful about how they are authored and to account for users with disabilities when designing them. The most recognized standard for making accessible rich internet applications is called ARIA (accessible rich internet applications), and it allows augmenting markup with attributes that help assistive technologies (such as screen readers) have a good understanding of the state of the app and relay it to the user.

In Firefox OS we have a suite of core apps called Gaia that is the foundation of Firefox OS's user interface. It is really one giant web app, perhaps one of the biggest out there. Since our mission dictates that we make our products accessible, we have embarked on that journey: we created a screen reader for Firefox OS, and we got to work making Gaia screen-reader friendly. It has been a long and Sisyphean process, where we would arrive at one module in Gaia, learn the code, fix some issues, and move on to the next module. It feels something like this:

[Photo: A California Department of Forestry helicopter dumps water on a grass fire in Benicia. (Robinson Kuntz/Daily Republic)]

Firefox OS has grown tremendously in a couple of years. Things never slowed down, and we were always revamping one app or another, trying out something new, and evolving rapidly. This means that accessibility was always one step behind. If we got an app accessible in version n, n+1 was around the corner with a whole new everything. Besides working on Gaia, we have always been looping back to our screen reader, making it more robust and adding features. We have consistently been straddling the gap.

Firefox OS has achieved some amazing milestones in its short life. Early in the project, there was still a hushed uncertainty. Did we over-promise? Could we turn a proof of concept into a mass-market device? There were so many moving parts for a version one release. Accessibility was not a product priority.

The return on investment

When I think about making our products accessible for the people that can't see or to help a kid with autism, I don't think about a bloody ROI. — An angry Tim Cook

Take 5 seconds, and let that sink in. Apple is not a charity; they are one of the most profitable companies on the planet. Still, they understand the social value of making their products accessible. Yet, I will argue that there is a bloody return on investment in accessibility. Mobile is changing our social perception of disability and blurring the line between permanent and temporary barriers. The prevailing assumption used to be that your user will sit in front of a 14″ monitor with a keyboard, a mouse, and undivided attention. But today there can be no assumptions; an app needs to be usable in many situations that impair the user in comparison to a desktop setup:

- A user will browse the web on a small, 3.5″ device with no keyboard, and only their inaccurate fat fingers as a pointing device for activating links.
- A driver will need to keep their eyes on the road and cannot interact with complex interfaces.
- A cyclist on a cold winter day will have gloves on and will want to look up where they are going on a map.
- A pedestrian will look up a nearby restaurant on a sunny day, with plenty of glare making it hard to read their phone's screen.

This shouldn't happen. The edge case of permanently impaired users is eclipsed by the common mobile use case, which needs to appeal to users with all sorts of temporary impairments: motor, visual, and cognitive. Apple understands that with Siri, and Google does too with Google Now. In Firefox OS, sooner or later we will need a good voice input/output story.

I made a case for accessibility, and I could probably stop here. But I won't, because the real benefit of an accessible device is priceless. While blind smartphone users are a small fraction of the general population, the impact on their lives is so much greater. We all benefit from that smartphone in our pocket. The first iPhone was a real revolution. It allows us to check mail on the go, share our lives on social networks, ignore our family, and pretend like we are doing something important at awkward parties. But for blind users, smartphones have increased their quality of life in profound and amazing ways. Blind smartphone owners are more independent, less isolated, and they can participate in online life like never before. Prior to smartphones, blind folks depended on very expensive gadgets for mobile computing. Today, a smartphone with a few handy apps could easily replace a $10,000 specialty device. Smartphones in the hands of blind users are a very big deal.

What we need to do

To make this happen, every decision by our product team, every design from UX, and every line of code from developers needs to account for the blind user experience. This isn't as big a deal as it sounds; screen reader support is just another thing to account for, like localization. We know today that designing and developing UI for right-to-left languages takes some consideration, especially if you live in a left-to-right world. What we need is project-wide consciousness around accessibility. It is great that we have an accessibility team, and I think Mozilla benefits from it. But this does not let anyone else off the hook from understanding accessibility, embedding it in our products, and embracing it as a value.

I fear that this post will disappoint because I won't get into how blind users use smartphones, and how developers should account for the screen reader. I have written in the past about this, and Yura has some good posts on that as well. And yes, we need to step up our game, document, and communicate more. But for now, here are two things you could do to get a better picture:

- If you own an Android device or iPhone, turn on the screen reader, close your eyes, and learn to use it. Challenge yourself to complete all sorts of tasks with your screen reader on. Test the screen reader's limits.
- With your Firefox OS device, turn on the screen reader. It works in the same fashion as the iOS or Android one does. Check your latest creation, and see what is broken and missing.

2015 is going to be a great year for Firefox OS. I have already heard all sorts of product ideas that have the potential of greatness. We are destined to ship something amazing. But for blind users, it could be life-changing.