Week 10

This last week was extremely busy! At the beginning of the week I ran two final tests while also starting to analyze the data we already had. We organized all of the data by participant first, then created Google Sheets to calculate the correlations and averages. It took a while to figure out and to get all the data into the same format. After we finished the last test, we finally input all of the participant data and created the graphs for our poster, presentation, and paper. I worked on writing our paper, helped make our poster, and helped make our presentation on Wednesday. After the presentation, the last step was to put the finishing touches on our paper. The internship finished up so fast that it's weird to think it's over! I'm so grateful for this experience; I really liked learning about caption metrics and doing research for the first time.
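For anyone curious, the correlation calculation we did in Google Sheets (its CORREL function) can be reproduced in a few lines of Python. This is just an illustrative sketch; the scores and ratings below are made-up placeholders, not our study data.

```python
# Pearson correlation between two series, as Google Sheets' CORREL computes it.
# The sample values here are hypothetical placeholders, not real study data.
from math import sqrt

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Example: hypothetical metric scores vs. participant quality ratings.
metric_scores = [0.90, 0.80, 0.70, 0.60]
user_ratings = [4.5, 4.0, 3.0, 2.5]
print(round(pearson(metric_scores, user_ratings), 3))
```

A coefficient near 1 would mean the metric tracks participant ratings closely, which is essentially what our analysis was checking for each metric.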

Week 9

This week we got 7 new participants, which is good progress! All of our meetings with participants so far have gone smoothly. However, we noticed that the responses for question 3 were odd. Question 3 asked participants to rate the quality of the error clip after they watched both the error and accurate clips, so we expected quality ratings to decrease as participants recognized more errors. The opposite happened: participants gave relatively high ratings on question 3. We think the question may have been worded unclearly, and that participants were rating the accurate clip instead of the error clip. So the bad news this week is that we won't be able to use the data from question 3.

Week 8

I worked on recruiting participants for our study this week and ran a few tests. I found participants among friends and friends of friends, and I also posted in more Facebook groups. We determined that we would need to change our experiment and throw out our last four tests. Luckily, we were able to change the captions on the YouTube videos without having to re-upload any of them. Next week I look forward to scheduling even more studies.

Week 7

This past week, I helped create a new version of our experiment. Our pilot studies had been going well, but one pilot participant suggested that we embed our videos into a survey instead of showing the videos one at a time and collecting data through a repeating survey after each video. I spent a lot of time looking at different survey websites and determining whether any would work for us. At first, I built a survey on SurveyMonkey, but I then found out that SurveyMonkey requires signing up for a paid plan. We had to scrap that, but then I realized that our videos could be embedded in the Google Form through YouTube. The downside is that you need to navigate away from the Google Form in order to watch the video in fullscreen on YouTube, but the upside is that it's free and easily added onto our existing survey! We ran some actual studies using the new survey, and they all went well. However, we realized that the error and accurate captions differ in style to the point where they may skew our results. Next week we need to determine what to do about this and whether we need to redo our stimulus clips.

Week 6

This week was a lot for me, especially because I also moved back from NY to California on Tuesday. I worked on fixing the delay in all of the remaining videos, which took a few iterations of edits. I helped out with our methods presentation on Thursday, and I felt we did pretty well. We were a little behind on getting our pilot testing ready, but we finally ran it today, Friday. The pilot testing went well; we tested with both a hearing and a deaf participant, and we got some helpful feedback about the clarity of the instructions. We also realized that our captions were not the most up to date and that we needed to redownload them from Google Drive. Next week, I'm looking forward to starting our actual testing if we have people signed up and ready by then. We're also planning to do another pilot test before that, which should go pretty smoothly.

Week 5

This week we worked on finalizing everything we need for pilot testing. I worked primarily on fixing the delay in all of the caption-error video clips, which took some time because I had to get used to the CADET software. We also made some changes to our flyer and to our pre- and post-questions, in order to get everything ready and move forward with our experiment. Since this week was shorter, I only got a few delay fixes done, but I feel confident that I can fix many more much faster in the future. Next week I plan to fix all of the remaining delays and run the pilot study, as well as finish up our presentation.
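For a rough idea of what a delay fix involves, the sketch below shifts SRT-style caption timestamps earlier by a fixed offset. This is an illustration only: the actual edits were done in the CADET tool, and the SRT format, cue text, and two-second offset here are all assumptions for the example.

```python
import re

# Sketch: shift SRT-style caption timestamps by a fixed offset to correct a
# constant caption delay. Illustrative only -- our actual edits used CADET,
# and the cue text below is a made-up example.

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")

def shift_timestamp(match, offset_ms):
    """Shift one HH:MM:SS,mmm timestamp by offset_ms (clamped at zero)."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = max(((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms, 0)
    sec, ms = divmod(total, 1000)
    return f"{sec // 3600:02d}:{sec // 60 % 60:02d}:{sec % 60:02d},{ms:03d}"

def shift_srt(srt_text, offset_ms):
    """Apply the shift to every timestamp in an SRT document."""
    return TIMESTAMP.sub(lambda m: shift_timestamp(m, offset_ms), srt_text)

cue = "1\n00:00:07,250 --> 00:00:09,750\nHello and welcome back.\n"
print(shift_srt(cue, -2000))  # the caption now appears two seconds earlier
```

A negative offset moves captions earlier, which is the direction needed when captions lag behind the audio.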

Week 4

I enjoyed this week, mainly because we got to make many big-picture decisions about how our experiment will run and what will work best to get accurate results. I felt like a researcher for the first time! I also had fun with the details of our work, such as watching live TV with captions and looking for errors. This task was a lot harder than it seemed, because the captions were so delayed that I had to hold what I was hearing in the audio in mind for five seconds before I could verify it when the captions finally popped up, all while listening to new audio. It was like a mental exercise. I had to replay many clips to make sure I understood what mistakes were occurring so that I could classify them. I felt this was a pretty successful technique, and I'm excited to continue next week to gather a few more clips to test with subjects. Next week, I will also continue writing the methods section of our paper, as I finally finished the introduction and background sections. We also finally heard back from NCAM and figured out how WWER weights would be calculated; I made sure to include this in the background of our paper. I will also help finalize our participant flyer, and I think we will probably start advertising our study next week as well, depending on how the IRB review goes. We will also need to start looking at the code scripts and seeing how to evaluate each metric for each clip we have chosen. I am excited to see how all this goes, and hopefully the code doesn't give us too much trouble.

Week 3

This week I worked a lot on summarizing and presenting our research. I went on a wild goose chase to find the 17 weighted error types of WWER, but they weren't anywhere on the internet. Hopefully NCAM gets back to us next week so that we can use WWER in our experiment. I also helped out with our presentation, and I felt it went pretty well. What has taken most of my time is sifting through all the information we found in our literature review and organizing it in a way that makes sense for our written background section. It has been challenging but rewarding to write up our research in a way that is presentable and polished. My goal is for the background we write to make sense and sound scholarly. Next week, I'll finish that up first, and then work on writing a descriptive methods plan for our experiment. I also want to help write code that will evaluate all of our videos' caption metrics and find which videos have the most skewed ratios between metrics. Those videos will be the most interesting to test, because they'll show most clearly which metric is more accurate.
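As a sketch of what that metric-comparison script could look like, the snippet below ranks clips by how much two metrics' quality scores disagree. The clip names and scores are hypothetical placeholders, and "skew" is taken here as simply the absolute difference between the two scores; our actual script and metric scales may differ.

```python
# Rank clips by how strongly two caption metrics disagree about them.
# Clip names and scores are hypothetical placeholders, and "skew" is
# taken here as the absolute difference between the two quality scores.

clips = {
    # clip_name: (wer_quality, ace_quality), both on a 0-1 higher-is-better scale
    "news_a": (0.92, 0.88),
    "news_b": (0.75, 0.95),
    "sports_a": (0.60, 0.62),
}

def skew(scores):
    """Absolute disagreement between the two metrics' quality scores."""
    wer_q, ace_q = scores
    return abs(wer_q - ace_q)

ranked = sorted(clips, key=lambda name: skew(clips[name]), reverse=True)
print(ranked)  # clips where the metrics disagree most come first
```

Clips at the top of this ranking are the most informative to test with participants, since the metrics make conflicting predictions about them.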

Week 2

My work this week centered around live captioning and the different types of caption metrics out there. I feel I made good progress in solidifying our research question and working out the uncertain details surrounding it. At the start of the week, our research question was "How well does the WER/WWER assess quality compared to the NER model? How should we define quality? (user experience, industry applicability, etc.)", and by the end of the week it was "How well/accurately do the WER/WWER/NER/ACE models assess quality compared to each other?" I focused on NER captioning first and researched it in depth, but I found out that you need to be trained to evaluate NER, so I set that topic aside to research other metrics. I then found out about ACE and looked into it. To make sure our line of research hasn't already been done, I looked for existing research similar to our question. I found one article that had already compared the WER and ACE metrics and found that ACE did better than WER, but I learned that applying metrics to videos is very different from applying them to live TV, so our research question is still new and valid. Next week, I'm excited to look more into how we'll set up our procedures, and to get into any coding we'll need to do for the project.

Week 1

This first week has been very interesting and exciting for me. At first, I was a little underwhelmed by the project, because I thought that doing a lit review this week would not be as fun as a hands-on project. But as I read more and more articles, I have been getting more and more excited about this topic. What interests me most so far is how researchers choose to set up experiments. I've learned about this for the first time through both the CITI training and the academic literature. Much more careful consideration goes into setting up an experiment than I thought, such as navigating limitations, protecting subjects' autonomy and privacy, and balancing these factors to make a fruitful experiment. I read one article about an experiment studying where people's eyes look, and for how long, when watching movies with captions. It was interesting to see that even though edited captions took less time to read, and would theoretically be better because they grant the viewer more time to look at the visuals, most DHH individuals prefer verbatim captions. The reason has to do with clarity and comprehension. How captions and caption metrics affect comprehension is something I'd like to research more. I also need to research how exactly captions are edited and what qualifies as edited, verbatim, or near-verbatim. Finally, I need to look into WER and WWER calculations, and how to run them in Jupyter, for next week.
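As a starting point for those calculations, here is a minimal sketch of the textbook WER formula in Python: word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. This is the standard definition, not any specific tool's implementation, and WWER's error-type weighting is not shown.

```python
# Minimal word error rate (WER) sketch: word-level edit distance divided by
# the number of words in the reference transcript. Textbook formula only;
# WWER additionally weights error types, which is not shown here.

def wer(reference, hypothesis):
    ref, hyp = reference.split(), hypothesis.split()
    # Dynamic-programming (Levenshtein) edit distance over words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1          # substitution cost
            curr.append(min(prev[j] + 1,       # deletion
                            curr[j - 1] + 1,   # insertion
                            prev[j - 1] + cost))
        prev = curr
    return prev[-1] / len(ref)

print(wer("the quick brown fox", "the quick brown box"))  # one substitution in four words -> 0.25
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is one reason weighted variants like WWER exist.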
