Unicode support for avformat_open_input in Windows

Anyone who has written a cross-platform application has had plenty of quirks and quests to deal with. A typical one is correctly handling multibyte/Unicode file paths in Windows. Although Qt handles them pretty well, when you write your own library you have to do it yourself.

Another level of quests is using third-party libraries that were not designed for cross-platform usage. For example, if you want to use the ffmpeg / libav libraries in Windows, you have to deal with the lack of std::wstring parameters in the API. One way to deal with it is to set up custom IO for the AVFormatContext and handle file paths yourself. I found a wonderful article and code example of how to do it in the blog of Marika Wei. Slightly adapted, the solution will handle all Windows paths:


struct {
#ifdef _WIN32
    std::wstring m_FilePath;
#else
    std::string m_FilePath;
#endif
    AVIOContext *m_IOCtx;
    uint8_t *m_Buffer; // internal buffer for ffmpeg
    int m_BufferSize;
    FILE *m_File;
};

#ifdef _WIN32
    m_File = _wfopen(m_FilePath.c_str(), L"rb");
#else
    m_File = fopen(m_FilePath.c_str(), "rb");
#endif

m_IOCtx = avio_alloc_context(
    m_Buffer, m_BufferSize, // internal buffer and its size
    0, // write flag (1=true, 0=false)
    (void*)this, // user data, will be passed to our callback functions
    IOReadFunc,
    0, // no writing
    IOSeekFunc
);
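
The two callbacks passed to avio_alloc_context are not shown in the snippet above. Here is a minimal sketch of what they might look like, assuming the opaque pointer is the FILE* itself (the snippet above passes this instead, so real code would cast back to the struct and use m_File). kAvSeekSize and kAvErrorEof are hypothetical stand-ins for FFmpeg's AVSEEK_SIZE and AVERROR_EOF, used only so the sketch builds without the FFmpeg headers.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins so the sketch builds without FFmpeg headers;
// real code would use AVSEEK_SIZE and AVERROR_EOF instead.
constexpr int kAvSeekSize = 0x10000;
constexpr int kAvErrorEof = -1;

// Read callback: fills buf with up to bufSize bytes and returns the
// number of bytes read, or the EOF code when nothing is left.
int IOReadFunc(void *opaque, uint8_t *buf, int bufSize) {
    FILE *file = static_cast<FILE *>(opaque);
    size_t bytesRead = fread(buf, 1, static_cast<size_t>(bufSize), file);
    if (bytesRead == 0) {
        return kAvErrorEof;
    }
    return static_cast<int>(bytesRead);
}

// Seek callback: returns the new position, or the file size when the
// caller passes the "size query" value as whence.
int64_t IOSeekFunc(void *opaque, int64_t offset, int whence) {
    FILE *file = static_cast<FILE *>(opaque);
    if (whence == kAvSeekSize) {
        long current = ftell(file);
        fseek(file, 0, SEEK_END);
        long size = ftell(file);
        fseek(file, current, SEEK_SET);  // restore the original position
        return size;
    }
    if (fseek(file, static_cast<long>(offset), whence) != 0) {
        return -1;
    }
    return ftell(file);
}
```

Note that fseek and ftell work with long, which is 32-bit on Windows, so files larger than 2 GB would need the 64-bit variants.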

Check out the full code on GitHub.

How to pass Amazon SDE interview

Amazon is considered to be one of the most wanted employers among software engineers who don’t work for any of the tech giants. Standing in one line with Google, Microsoft, Facebook and maybe some smaller ones like Twitter, Uber, Dropbox etc., it has an unstoppable flow of CVs from people passionate about working at a big scale.

But is it really that cool, demanding and, in the end, rewarding? A lot of people would disagree, others will be neutral, and only a few will agree. For example, the typical everyday job of a server-side SDE II responsible for the customer experience of purchasing goods may consist only of sending/receiving requests to/from internal web services, validating input data and fixing small bugs, and that’s all. Oh no, there’s one more thing: on-call rotations. So one week every few months (that depends on the team, but just to give you an idea) that employee, despite his “interesting and challenging” duties, will be responsible for fixing bugs in production asap, which literally means ASAP: during the weekend, in the evening, at night, it doesn’t matter.

That is why Amazon looks for people who won’t whine about such a lifestyle. Amazon has a dozen so-called “principles” (read “search criteria for new employees”), some of which contradict the others. For instance, they need employees who have a “bias for action” but “insist on the highest standards”, or who are “frugal” but “think big”, and so on. Interviewers will ask how you match these principles, and what they’re really interested in is whether you have experience working overtime, on weekends, under pressure, overnight, in order to deliver results on short notice and fix bugs. They tell you about it clearly: if you’re weak in programming or algorithms, it does not matter as long as you’re used to working overtime just to deliver results.

So how do you pass the Amazon interview? They will ask about your experience and will definitely ask for an example of a time you had tight deadlines and a half-finished task. They want to hear how you worked overnight and did not complain about it. If they do, you’ve passed, even if your solution for their O(N^2) dynamic programming puzzle is NP-complete.

Replacing QNetworkAccessManager for the great good

Everybody who uses Qt networking for small tasks will sooner or later face the oddities of QNetworkAccessManager. This class aims to be useful and convenient, but it has a few quite noticeable drawbacks. The first one, of course, is the inability to use it in a blocking way. What you should do instead is create an instance of QEventLoop and connect the reply’s finished() signal to its quit() slot.

QNetworkAccessManager networkManager;
QEventLoop loop;
QNetworkReply *netReply = networkManager.get(resource);
connect(netReply, SIGNAL(finished()), &loop, SLOT(quit()));
loop.exec();    

This is overkill and overengineering, of course. The same inconvenience strikes when you try to use it from a background thread to download something: QNetworkAccessManager needs an event loop, and it will launch one more thread of its own to do all the required operations.

It also has a lot of data, methods and abilities that are not needed for “everyday simple network operations” like querying some API or downloading files. I don’t know anybody who hasn’t looked for a substitute for it at least once. Fortunately, a solution exists.

Continue reading Replacing QNetworkAccessManager for the great good

Resources to learn and understand parallel programming. The hard way

There’s no way other than the hard way. (c)

Parallel programming is considered a difficult or even advanced topic by many programmers. It’s the starting point for even more advanced stuff like distributed computations, reliability, the CAP theorem, consensus problems and much more. Besides, a deep understanding of how the CPU and the operating system work can help you write less buggy software, and parallel programming can help you with that too.

In this post I will focus on books describing parallel programming on one computer with one CPU using classical approaches. They contain neither SSE instruction guides nor materials on CUDA or OpenCL. Similarly, you will find no resources about Hadoop and/or MapReduce and nothing about technologies that support parallel programming out of the box, like Go or Erlang.

So now I will go through all the resources which I find more or less useful. I’m not going to stick to any particular technology: the point is to understand the topic from different perspectives. The materials I’m referring to should generally not be considered entry-level, as they require a fair amount of knowledge. Nevertheless, the list is sorted starting from the “easier” things.

Continue reading Resources to learn and understand parallel programming. The hard way

Implementing spellchecking in desktop application in C++

When the user is supposed to enter a significant amount of text in your application, it’s better to help them keep it correct by checking the spelling. Basically, to check spelling you need a dictionary of words and an algorithm to search through these words. It might also be useful to offer the user possible corrections for any spelling error. This is where Hunspell comes in handy. It’s an open source library built on top of the MySpell library and used in a significant number of projects, ranging from open source ones like Firefox to proprietary ones like OS X. It has bindings for a number of platforms (.NET, Ruby etc.) and should be fairly easy to integrate into your project. In this post I’ll discuss how to integrate it into a C++/Qt project.
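
To make the “dictionary plus suggestions” idea concrete, here is a minimal sketch in plain C++. It is not Hunspell’s API or algorithm: isSpelledCorrectly, suggest and editDistance are hypothetical helpers that assume a small sorted word list and rank corrections by Levenshtein distance.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Levenshtein edit distance computed with a single rolling row.
int editDistance(const std::string &a, const std::string &b) {
    std::vector<int> row(b.size() + 1);
    for (size_t j = 0; j <= b.size(); ++j) row[j] = static_cast<int>(j);
    for (size_t i = 1; i <= a.size(); ++i) {
        int diagonal = row[0];          // distance for (i-1, j-1)
        row[0] = static_cast<int>(i);
        for (size_t j = 1; j <= b.size(); ++j) {
            int above = row[j];         // distance for (i-1, j)
            int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = std::min({row[j] + 1, row[j - 1] + 1, diagonal + cost});
            diagonal = above;
        }
    }
    return row[b.size()];
}

// The dictionary is assumed to be sorted, so a binary search is enough.
bool isSpelledCorrectly(const std::vector<std::string> &sortedDictionary,
                        const std::string &word) {
    return std::binary_search(sortedDictionary.begin(),
                              sortedDictionary.end(), word);
}

// Returns dictionary words within maxDistance edits, closest first.
std::vector<std::string> suggest(const std::vector<std::string> &dictionary,
                                 const std::string &word, int maxDistance) {
    std::vector<std::pair<int, std::string>> ranked;
    for (const std::string &candidate : dictionary) {
        int distance = editDistance(word, candidate);
        if (distance <= maxDistance) ranked.emplace_back(distance, candidate);
    }
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const std::pair<int, std::string> &l,
                        const std::pair<int, std::string> &r) {
                         return l.first < r.first;
                     });
    std::vector<std::string> result;
    for (const auto &entry : ranked) result.push_back(entry.second);
    return result;
}
```

A linear scan like this is only reasonable for small word lists; a real spellchecker also has to deal with affixes and encodings, which is what a library like Hunspell is for.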

Continue reading Implementing spellchecking in desktop application in C++

Classic Producer-Consumer in Qt/C++

Producer-Consumer is a classic pattern of interaction between two or more threads that share a common task queue, with workers processing that queue. When I first came across such a task, I googled for standard approaches to this problem in Qt, but they were based on signals/slots plus synchronization primitives, while I wanted a simple and clear solution. Of course, in the end I reinvented the wheel, and I invite you to take a look at it.

For synchronization in Producer-Consumer it’s useful to have a mutex and some kind of wait event for blocking until there is something to process. In Qt you have QMutex and QWaitCondition, which are all we need.

Let’s suppose we have the following data structures:

        QWaitCondition m_WaitAnyItem;
        QMutex m_QueueMutex;
        QVector<T*> m_Queue;

where T is the type of the messages we’re producing/consuming. So we have a queue of elements to be processed, a mutex to secure access to the queue and a wait condition to wait on if the queue is empty.

For Producer-Consumer we usually need the methods produce() and consume(). Let’s see how we can implement them.
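
As a rough illustration of the idea (not the implementation from the full post), here is a minimal sketch that uses the standard library instead of Qt: std::mutex and std::condition_variable stand in for QMutex and QWaitCondition, and std::deque replaces QVector. TaskQueue is a hypothetical class name.

```cpp
#include <condition_variable>
#include <deque>
#include <mutex>

template <typename T>
class TaskQueue {
public:
    // Called by producers: push an item and wake up one waiting consumer.
    void produce(T *item) {
        {
            std::lock_guard<std::mutex> lock(m_QueueMutex);
            m_Queue.push_back(item);
        }
        m_WaitAnyItem.notify_one();
    }

    // Called by consumers: block until the queue is non-empty,
    // then take the oldest item.
    T *consume() {
        std::unique_lock<std::mutex> lock(m_QueueMutex);
        // The predicate guards against spurious wakeups.
        m_WaitAnyItem.wait(lock, [this] { return !m_Queue.empty(); });
        T *item = m_Queue.front();
        m_Queue.pop_front();
        return item;
    }

private:
    std::condition_variable m_WaitAnyItem;
    std::mutex m_QueueMutex;
    std::deque<T *> m_Queue;
};
```

The Qt version would follow the same shape: lock the QMutex, wait on the QWaitCondition in a loop while the queue is empty, and wake one waiter after each push.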

Continue reading Classic Producer-Consumer in Qt/C++

Implementing autocomplete for English in C++

When it comes to implementing autocompletion in C++ in some type of input field, the question is which algorithm to choose and where to get the source for completion. In this post I’ll try to answer both questions.

As for the algorithm, SO gives us hints about tries, segment trees and others. You can find a good article about them. The author has implemented some of them in a repository called FACE (fastest auto-complete in the east). You can easily find it on GitHub. This solution is used for autocompletion in the DuckDuckGo search engine, which should tell you how good it is. Unfortunately, their solution depends on libuv and joyent http-parser, which is not good if you just need to integrate autocompletion functionality into your C++ application rather than build an auto-complete server and send queries to it. Another drawback: libuv and cpp-libface itself fail to compile in Windows, which is bad if you’re building a cross-platform solution.

You can find out how to build FACE into your cross-platform C++ application below.
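
To give an idea of the simplest of these structures, here is a minimal trie sketch for prefix completion. It is not how FACE works (FACE ranks completions, while this returns every match in alphabetical order), and the Trie class with its insert and complete methods is hypothetical.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

class Trie {
public:
    // Adds a word, creating one node per character as needed.
    void insert(const std::string &word) {
        Node *node = &m_Root;
        for (char c : word) {
            std::unique_ptr<Node> &child = node->children[c];
            if (!child) child.reset(new Node());
            node = child.get();
        }
        node->isWord = true;
    }

    // Returns all stored words that start with the given prefix.
    std::vector<std::string> complete(const std::string &prefix) const {
        const Node *node = &m_Root;
        for (char c : prefix) {
            auto it = node->children.find(c);
            if (it == node->children.end()) return {};
            node = it->second.get();
        }
        std::vector<std::string> results;
        std::string current = prefix;
        collect(node, current, results);
        return results;
    }

private:
    struct Node {
        bool isWord = false;
        std::map<char, std::unique_ptr<Node>> children;  // ordered by char
    };

    // Depth-first walk below node, appending every complete word.
    static void collect(const Node *node, std::string &current,
                        std::vector<std::string> &out) {
        if (node->isWord) out.push_back(current);
        for (const auto &entry : node->children) {
            current.push_back(entry.first);
            collect(entry.second.get(), current, out);
            current.pop_back();
        }
    }

    Node m_Root;
};
```

Lookup cost depends on the prefix length and the number of matches rather than on the size of the dictionary, which is the main reason tries are suggested for this task.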

Continue reading Implementing autocomplete for English in C++

Handling drag’n’drop of files in Qt under OS X

If you have ever tried to handle drag’n’drop of files in your Qt application, you have probably come up with code like the following.
First of all you need a DropArea somewhere in your application, which will handle the drops:

DropArea {
  anchors.fill: parent
  onDropped: {
    if (drop.hasUrls) {
      var filesCount = yourCppModel.dropFiles(drop.urls)
      console.log(filesCount + ' files added via drag&drop')
    }
  }
}

Here yourCppModel is a model exposed to QML in main.cpp (or wherever) like this:

QQmlContext *rootContext = engine.rootContext();
rootContext->setContextProperty("yourCppModel", &myCppModel);

and int dropFiles(const QList<QUrl> &urls) is just an ordinary method exposed to QML via the Q_INVOKABLE macro.

You will surely notice that everything works fine unless you’re working under OS X. In OS X, instead of QUrls to local files, you will get something like this: file:///.file/id=6571367.2773272/. There’s a Qt bug filed for that, and it even looks closed, but it still doesn’t work for me. That’s why I’ve implemented my own helper using a mix of Objective-C and Qt/C++ code.

Continue reading Handling drag’n’drop of files in Qt under OS X

How to download huge folder from Dropbox

If you need to download a folder from Dropbox which contains tons of files, no known browser extension can help you. Dropbox moves each file download to its own page, so you can’t do it directly.

When I faced this problem I knew I would need to create my own solution, and some quick googling just confirmed that.

I opened the JavaScript console and extracted all the links from the folder. Then I replaced “dl=0” with “dl=1” to get the actual download links.

var links = document.querySelectorAll("div.filename a")
var processed = Array.prototype.map.call(links, 
  function(link) { 
    return link["href"].replace("dl=0", "dl=1"); 
})
console.log(processed.join("\n"))

Afterwards I copied those into a file called links_to_download. If your processed array is too big, you can print it to the console in chunks, using JavaScript’s slice(start, end) method. Now the problem is to download them.

I came up with wget for this problem. Linux and OS X users should have wget available (OS X users can install it via e.g. homebrew). Windows users have to download it separately, install it and add it to the PATH environment variable. Additionally, I used the --trust-server-names and --content-disposition parameters to save the real filenames instead of the hashed Dropbox URLs. Then I ran into the problem that wget fails to download a file on the first request and the default request timeout is quite long, so I set it to 5 seconds. Now it makes several timed-out requests, but they quickly resolve to a successful one.

wget --content-disposition --trust-server-names \
  --timeout=5 -i links_to_download

Also, in order to download via https in Windows you will probably need to use “--no-check-certificate”.

Tips and tricks to improve performance of your ACM solution

Here I gathered system-programming tricks that can improve performance of your solution in C++ dramatically!

  • Use scanf/printf functions for standard IO instead of cin/cout
  • Memory-align buffers and structures to WORD size of your architecture (4 bytes for 32-bit and 8 bytes for 64-bit)
  • Use arrays instead of linked lists (contiguous memory makes better use of the CPU cache)
  • Avoid “if” statements in loops
  • The if-clause should contain the code which is more likely to execute (if-condition == true)
  • Use inlining for short functions
  • Use objects allocated on the stack rather than on the heap (local objects in functions instead of ones allocated with malloc/new)
  • Use pre-calculated hardcoded data (e.g. you can store first N prime numbers or first N Fibonacci numbers in order not to calculate them every time you need one)